INDEX
    Explanations
    NEGATIVE LOGITS
     memes          -0.08
    няти            -0.07
     участ          -0.06
     Discussion     -0.06
    _mem            -0.06
     computers      -0.06
     SOCIAL         -0.06
     scrolling      -0.06
     Semantic       -0.06
     progress       -0.06
    POSITIVE LOGITS
     night          0.09
     NIGHT          0.08
     Night          0.08
     noche          0.06
     ніч            0.06
    -remove         0.06
    řád             0.06
    getMockBuilder  0.06
     ráno           0.06
                    0.06
    Act Density 0.024%

    No Known Activations