INDEX
    Explanations
    NEGATIVE LOGITS
     curse      -0.06
    ascar       -0.06
    alah        -0.06
     Mayor      -0.06
    шую         -0.06
                -0.06
    пеки        -0.06
     average    -0.06
     şekl       -0.06
     حافظه      -0.06
    POSITIVE LOGITS
    otide       0.14
    Eb          0.07
     t          0.07
    _t          0.07
    TRAN        0.07
    (nt         0.07
    	T          0.07
     T          0.07
     otp        0.07
    τ           0.07
    Act Density 0.002%

    No Known Activations