    Negative Logits
    Token        Logit
    " genel"     -0.06
    " rage"      -0.06
    "\tall"      -0.06
    " гр"        -0.06
    "sand"       -0.06
    "<H"         -0.06
    "=r"         -0.06
    " eros"      -0.06
    "<object"    -0.06
                 -0.06
    Positive Logits
    Token             Logit
    "Oh"              0.07
    "ilen"            0.07
    "كام"             0.07
    " wound"          0.07
    " Taste"          0.07
    "(MouseEvent"     0.06
    "при"             0.06
    "дать"            0.06
    "uten"            0.06
    "(User"           0.06
    Activation Density: 0.003%

    No Known Activations