INDEX
    Negative Logits
     like  -0.84
     obti  -0.80
    zijn  -0.75
     られる  -0.74
     gaji  -0.73
     Scottish  -0.72
     everyone  -0.70
     several  -0.70
    cozy  -0.70
     português  -0.69
    Positive Logits
    Indiana  1.02
     rentable  0.94
     Indiana  0.93
     INDIANA  0.84
     Purdue  0.82
    lename  0.80
    SCRIBE  0.79
    carril  0.79
    ERM  0.78
    JMenuItem  0.77
    Act Density 0.001%

    No Known Activations