    NEGATIVE LOGITS
     JE           -0.06
     Contact      -0.06
    \Unit         -0.06
     owe          -0.06
    идент         -0.06
     apology      -0.06
    owy           -0.06
     discourse    -0.06
     arrogance    -0.06
     Yer          -0.06
    POSITIVE LOGITS
     الآ          0.07
    GORITH        0.07
    .commons      0.06
     عالية        0.06
    Outside       0.06
    _aes          0.06
     Personen     0.06
     codecs       0.06
    Surv          0.06
    /events       0.06
    Activation Density: 2.116%

    No Known Activations