    Negative Logits
    Token        Logit
    "fen"        -0.09
    " والو"      -0.09
    " TRE"       -0.08
    " Aya"       -0.08
    " gant"      -0.08
    "šla"        -0.08
    " поступ"    -0.08
    "liter"      -0.08
    " zwe"       -0.08
    " деся"      -0.08
    Positive Logits
    Token                    Logit
    " deficits"              0.08
    " Watson"                0.08
    " ca"                    0.07
    " kissing"               0.07
    (token not recovered)    0.07
    " parag"                 0.07
    " metus"                 0.07
    " gang"                  0.07
    (token not recovered)    0.07
    " curl"                  0.07
    Activation Density: 0.059%

    No Known Activations