INDEX
Explanations
mentions of legal charges and criminal activities
Negative Logits
-valu      -0.16
Prostit    -0.15
νι         -0.15
Assass     -0.15
hiba       -0.15
eref       -0.14
icens      -0.14
eroon      -0.14
ibern      -0.14
icare      -0.14
Positive Logits
charged    0.36
accused    0.32
suspects   0.29
charged    0.28
charges    0.28
suspected  0.27
Charg      0.27
alleged    0.27
charge     0.25
suspect    0.25
Activations Density 0.140%