INDEX
    Explanations
    NEGATIVE LOGITS
     preventiva
    -0.09
     Ner
    -0.09
     bookkeeping
    -0.08
     Nb
    -0.08
    فته
    -0.08
     asupra
    -0.08
     Nel
    -0.08
     Nell
    -0.08
     NB
    -0.07
     Kirchen
    -0.07
    POSITIVE LOGITS
    Tokyo
    0.07
    560
    0.07
    Elev
    0.07
    ="'.
    0.07
    INST
    0.07
     nij
    0.07
     p
    0.07
     adults
    0.07
    .endpoint
    0.07
     achieving
    0.07
    Activation Density: 0.003%

    No Known Activations