Explanations
words related to criminal activities, especially robberies
instances of the word "robbery" and its variations
Negative Logits
Token             Logit
UTC               -0.78
mite              -0.77
Plex              -0.76
Reviewer          -0.76
mitt              -0.70
EMP               -0.70
rolog             -0.69
TPPStreamerBot    -0.69
rum               -0.69
audi              -0.68
Positive Logits
Token             Logit
robbery           1.24
robberies         1.10
spree             1.07
robbers           0.94
robbing           0.91
robber            0.88
robbed            0.85
eering            0.78
sters             0.74
eers              0.74
Activations Density 0.016%
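
For readers unfamiliar with the density figure, the sketch below shows one common way such a number could be computed: the fraction of token positions at which the feature's activation is above zero. This is an assumption about the metric, not a statement of how this dashboard derives it; the function name `activation_density`, the `threshold` parameter, and the toy data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all; the dashboard may define it differently.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 10,000 token positions, 2 of which fire.
toy = [0.0] * 10_000
toy[17] = 3.2
toy[4242] = 1.1

print(f"{activation_density(toy):.3%}")  # prints 0.020%
```

Under this reading, a density of 0.016% would mean the feature fires on roughly 16 out of every 100,000 token positions.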