    Negative Logits
    тися      -0.07
    лати      -0.07
    lda       -0.06
     reign    -0.06
    محمد      -0.06
    acs       -0.06
     USD      -0.06
    'nda      -0.06
              -0.06
              -0.06
    Positive Logits
     aVar         0.07
     erotik       0.07
     Vertical     0.06
    ">'.          0.06
     Valid        0.06
     Alman        0.06
     корист       0.06
     terminate    0.06
    ตำแหน         0.06
    arshal        0.06
    Activation Density: 0.003%

    No Known Activations