    Negative Logits
    Token                   Logit
    [unrendered token]      -0.07
    " facile"               -0.06
    ".private"              -0.06
    "ruh"                   -0.06
    " пти"                  -0.06
    " apopt"                -0.06
    " Ru"                   -0.06
    "\tutil"                -0.06
    " Unicorn"              -0.06
    " goed"                 -0.06
    Positive Logits
    Token                   Logit
    "(before"               0.08
    "IFY"                   0.07
    "pollo"                 0.07
    " Drawer"               0.07
    " suy"                  0.06
    " dummy"                0.06
    " آذ"                   0.06
    "Tahoma"                0.06
    [unrendered token]      0.06
    " Witnesses"            0.06
    Activation Density: 0.006%

    No Known Activations