INDEX
Explanations
specific threats or coercive language
Negative Logits
lawsuits    -0.17
_FT         -0.17
Ì£          -0.16
ouro        -0.15
ÏĩÏİ        -0.15
Uncomment   -0.14
sue         -0.14
idon        -0.14
å¯          -0.14
erm         -0.14
Positive Logits
officers    0.19
offending   0.17
officer     0.16
breaches    0.16
Officers    0.16
sentence    0.15
breached    0.15
Mr          0.15
Mr          0.14
matters     0.14
Activations Density: 0.018%