    Negative Logits
     IGNORE     -0.07
                -0.07
     homic      -0.06
     rugs       -0.06
    _social     -0.06
    ись         -0.06
    hor         -0.06
    .Marker     -0.06
    _mtime      -0.06
    li          -0.06
    Positive Logits
     swell      0.08
     swollen    0.07
    ULE         0.07
    145         0.07
     biến       0.06
     nevid      0.06
     Appalach   0.06
    DY          0.06
     terribly   0.06
     inflate    0.06
    Act Density 0.007%

    No Known Activations