    Negative Logits
    Token                 Logit
    " abi"                -0.08
    " sexuality"          -0.08
    ".speed"              -0.07
    "jai"                 -0.07
    "_IPV"                -0.07
    " speed"              -0.07
    "Much"                -0.07
    " seksuele"           -0.07
    (token not shown)     -0.07
    " pace"               -0.07
    Positive Logits
    Token                 Logit
    " بريد"               0.09
    (token not shown)     0.08
    " أرب"                0.08
    " وو"                 0.08
    " رسید"               0.07
    (token not shown)     0.07
    " tst"                0.07
    " scint"              0.07
    (token not shown)     0.07
    " Презид"             0.07
    Activation Density: 0.003%

    No Known Activations