INDEX
    Explanations
    Negative Logits
     Koh              -0.06
    _quota            -0.06
    ока               -0.06
    -inst             -0.06
    RX                -0.06
    اءات              -0.06
    _bn               -0.06
    Inf               -0.06
    .SetKeyName       -0.05
     Initialization   -0.05
    Positive Logits
    .Open             0.07
     Pen              0.07
    _adj              0.07
     Sweden           0.07
     javascript       0.07
    .lb               0.06
     whore            0.06
     github           0.06
    "&                0.06
     #__              0.06
    Act Density 0.001%

    No Known Activations