INDEX
    Negative Logits
    Token            Logit
    " pj"            -0.08
    " зл"            -0.08
    " Horr"          -0.08
    "ус"             -0.08
    "(tv"            -0.07
    " compon"        -0.07
    " homolog"       -0.07
    " нами"          -0.07
    " компонент"     -0.07
    "(ht"            -0.07
    Positive Logits
    Token            Logit
    " lại"           0.09
    " itib"          0.08
    "Again"          0.08
    "Img"            0.08
    " again"         0.08
    " nochmals"      0.08
    "一下"           0.08
    " wiederum"      0.08
    "ynam"           0.07
    " tekrar"        0.07
    Activation Density: 0.016%

    No Known Activations