INDEX
    Negative Logits
    "فهوم"  -0.06
    " declare"  -0.06
    " --"  -0.06
    " shovel"  -0.06
    " áo"  -0.06
    "hodob"  -0.06
    " Dropdown"  -0.06
    " Plain"  -0.06
    " rumor"  -0.06
    "forgot"  -0.06
    Positive Logits
    " Wel"  0.08
    " مستق"  0.07
    (token missing in source)  0.07
    " iets"  0.07
    " worsening"  0.06
    " derece"  0.06
    " перш"  0.06
    " applicationContext"  0.06
    " advance"  0.06
    "_BP"  0.06
    Activation Density: 0.002%

    No Known Activations