INDEX
Explanations
words related to criminal activities
references to crime
Negative Logits
ersed             -0.79
achev             -0.78
por               -0.73
arity             -0.69
TPPStreamerBot    -0.68
uran              -0.68
zl                -0.66
Dak               -0.66
)=(               -0.66
paren             -0.65
Positive Logits
spree             1.09
synd              0.84
fighting          0.81
unfocusedRange    0.80
offenses          0.80
perpetrated       0.80
punishable        0.80
fighter           0.78
offences          0.78
crimes            0.77
Activations Density: 0.022%