INDEX
Explanations
mentions of individuals being arrested for criminal activities
phrases indicating arrests and legal charges
Negative Logits
pher      -0.77
jar       -0.70
phabet    -0.68
wolves    -0.68
antha     -0.67
rians     -0.65
holes     -0.65
patch     -0.63
itar      -0.63
oom       -0.63
Positive Logits
suspicion      1.40
behalf         1.27
charges        1.17
etime          1.04
misdemeanor    0.95
grounds        0.95
felony         0.94
spurious       0.94
suspicions     0.93
accusations    0.93
Activations Density: 0.081%