    Negative Logits
    Token           Logit
    "Hiring"        -0.08
    " positiv"      -0.08
    " dân"          -0.08
    " Coaching"     -0.08
    " outreach"     -0.08
    "બંધ"           -0.08
    " publica"      -0.08
    " યોજના"        -0.08
    " вакцина"      -0.08
    " announces"    -0.08
    Positive Logits
    Token           Logit
    " Deleted"      0.11
    " trash"        0.11
    "deleted"       0.10
    "Trash"         0.10
    "discard"       0.10
    " Trash"        0.10
    "Deleted"       0.09
    "trash"         0.09
    " recycle"      0.09
    "Recycle"       0.09
    Activation Density: 0.002%

    No Known Activations