    Negative Logits
    Token                   Logit
    " rod"                  -0.08
    " depleted"             -0.07
    " Rockstar"             -0.07
    (token not captured)    -0.07
    (token not captured)    -0.07
    " Brooklyn"             -0.07
    " ок"                   -0.07
    " cripple"              -0.07
    " разв"                 -0.07
    " unst"                 -0.07
    Positive Logits
    Token       Logit
    " nemen"    0.10
    "-Min"      0.09
    "/min"      0.09
    " lex"      0.08
    " بذ"       0.08
    "|min"      0.08
    "-min"      0.08
    " लेने"     0.08
    "imum"      0.08
    "_bool"     0.07
    Activation Density: 0.013%

    No Known Activations