INDEX
    Explanations
    Negative Logits
    Token            Logit
    " verse"         -0.06
    " Johannesburg"  -0.06
    " clipped"       -0.06
    "FTA"            -0.06
    " six"           -0.06
    " مراجع"         -0.06
    " thr"           -0.05
    "_sat"           -0.05
    "finity"         -0.05
    "hawk"           -0.05
    Positive Logits
    Token            Logit
    (whitespace)     0.08
    " lasc"          0.08
    " thừa"          0.07
    (not shown)      0.07
    " Spacer"        0.07
    ".ColumnName"    0.07
    ".engine"        0.07
    "_DISABLE"       0.06
    " _↵"            0.06
    " Raptors"       0.06
    Activation Density: 0.002%

    No Known Activations