    NEGATIVE LOGITS
    Token            Logit
    " Gor"           -0.09
    " antar"         -0.08
    " tril"          -0.08
    "Tel"            -0.08
    " existência"    -0.08
    "Então"          -0.07
    "Após"           -0.07
    " sheriff"       -0.07
    "ร้อง"            -0.07
    "_PROCESS"       -0.07
    POSITIVE LOGITS
    Token            Logit
    " веществ"       0.08
    " sustancias"    0.08
    " పద"            0.08
    "affeine"        0.08
    "_ratio"         0.08
    " altre"         0.08
    " cią"           0.08
    " foods"         0.08
    " potatoes"      0.07
    " воздействия"   0.07
    Activation Density: 0.003%

    No Known Activations