INDEX
    Explanations
    Negative Logits
     спрос
    -0.06
    raç
    -0.06
     đảm
    -0.06
     Scot
    -0.06
     thankful
    -0.06
    ChangeListener
    -0.06
     pinnacle
    -0.06
     Hanna
    -0.06
     devis
    -0.06
     Peters
    -0.06
    Positive Logits
     pir
    0.07
    ish
    0.07
    kh
    0.07
    0.06
     &#
    0.06
     SP
    0.06
     Sep
    0.06
     행동
    0.06
     %
    0.06
     Мих
    0.06
    Activation Density: 0.000%

    No Known Activations