INDEX
    Explanations
    Negative Logits
    " swing"        -0.08
    "imony"         -0.08
    "lical"         -0.07
    "etje"          -0.07
    "دية"           -0.07
    "itant"         -0.07
    " imb"          -0.07
    " dissolved"    -0.07
    " лицо"         -0.07
    " touch"        -0.07
    Positive Logits
    " Cris"         0.10
    " comprend"     0.08
    " Perse"        0.08
    " Crisp"        0.08
    " crisp"        0.08
    "ومان"          0.07
    " ধরে"          0.07
    " separation"   0.07
    " lebens"       0.07
    " Ivan"         0.07
    Act Density 0.001%

    No Known Activations