    Negative Logits
    Token            Logit
    " Esp"           -0.07
    "\tGet"          -0.06
    " Bett"          -0.06
    " Armen"         -0.06
    " mig"           -0.06
    " hydr"          -0.06
    (no token shown) -0.06
    " Warn"          -0.06
    " bel"           -0.06
    " bij"           -0.06
    Positive Logits
    Token            Logit
    " EMAIL"         0.07
    "::*"            0.07
    "ISHED"          0.07
    "(Chat"          0.07
    "ائد"            0.07
    "teachers"       0.07
    "(rate"          0.07
    "olah"           0.07
    (no token shown) 0.06
    (no token shown) 0.06
    Activation Density: 0.001%

    No Known Activations