    Negative Logits
    " berbagai"          -0.09
    " большин"           -0.09
    " современные"       -0.09
    " исполнитель"       -0.08
    "phäre"              -0.08
    " представители"     -0.08
    "stöðu"              -0.08
    " bruke"             -0.08
    (token not shown)    -0.08
    " կողմ"              -0.08
    Positive Logits
    " wanting"            0.11
    " we're"              0.10
    " you're"             0.09
    " wants"              0.09
    " vậy"                0.09
    "正在"                0.08
    " chce"               0.08
    " asking"             0.08
    " someone's"          0.08
    " I'm"                0.08
    Activation Density 0.015%

    No Known Activations