INDEX
Explanations
phrases related to criminal activity and law enforcement
references to criminals and criminal activity
Negative Logits
zl        -0.66
orse      -0.65
bid       -0.64
ww        -0.63
chron     -0.61
intend    -0.61
/+        -0.61
ories     -0.60
ctive     -0.59
Wo        -0.59
Positive Logits
criminals     1.03
offenders     0.96
gangs         0.89
prey          0.88
offend        0.88
mastermind    0.87
trespass      0.85
infring       0.84
offender      0.80
smugglers     0.78
Activations Density: 0.012%