    Negative Logits
     Sizes
    -0.08
     размеры
    -0.08
     Schwartz
    -0.08
    !--
    -0.07
     Esc
    -0.07
    grö
    -0.07
     Fire
    -0.07
     Aman
    -0.07
     Chambers
    -0.07
     Languages
    -0.07
    Positive Logits
    been
    0.09
     battered
    0.08
     hum
    0.08
    0.08
     hardworking
    0.08
    oton
    0.08
     tired
    0.08
     hålla
    0.08
     boredom
    0.08
    Enough
    0.07
    Act Density 0.009%

    No Known Activations