    Negative Logits
    Token            Logit
    " residence"     -0.07
    "(am"            -0.07
    " crusher"       -0.06
    " borrow"        -0.06
    " coached"       -0.06
    " attacked"      -0.06
    "hamster"        -0.06
    " talks"         -0.06
    " бак"           -0.06
    " μας"           -0.06
    Positive Logits
    Token               Logit
    " international"    0.07
    " ист"              0.06
    "üy"                0.06
    ".Def"              0.06
    "pb"                0.06
    "خصوص"              0.06
    "ภาพยนตร"           0.06
    " oct"              0.06
    "egral"             0.06
    " wealthiest"       0.06
    Activation Density: 0.009%

    No Known Activations