INDEX
    Negative Logits
    Token             Logit
    "indice"          -0.07
    "addy"            -0.07
    " miraculous"     -0.06
    " multin"         -0.06
    "-Z"              -0.06
    "(digits"         -0.06
    " Bez"            -0.06
    " karena"         -0.06
    "_with"           -0.06
    " iq"             -0.06
    Positive Logits
    Token             Logit
    "_ELEM"           0.07
    " anale"          0.07
    " advis"          0.06
    (no token shown)  0.06
    "QUENCY"          0.06
    "UAGE"            0.06
    " місцев"         0.06
    "\tmodule"        0.06
    " ideological"    0.06
    "ngr"             0.06
    Activation Density: 0.024%

    No Known Activations