    Explanations
    Negative Logits
     допомоги     -0.07
                  -0.06
     zam          -0.06
    .directive    -0.06
     Planning     -0.06
    School        -0.06
     Zuk          -0.06
    Determin      -0.06
    damage        -0.06
    linear        -0.05
    Positive Logits
     telev        0.08
     Yüz          0.08
     Scottish     0.07
    tvrt          0.07
    ~":"          0.07
     сни          0.06
     plage        0.06
    щая           0.06
    -vector       0.06
    iembre        0.06
    Act Density 0.018%

    No Known Activations