INDEX
Explanations
references to crime and criminal plots
Negative Logits
Token       Logit
abuse       -0.19
Abuse       -0.18
abuses      -0.18
abusive     -0.17
abusing     -0.15
abus        -0.15
丸          -0.14
SECRET      -0.14
éŀ          -0.14
assassin    -0.14
Positive Logits
Token       Logit
rob         0.37
robbed      0.35
robbery     0.35
robber      0.34
rob         0.33
getaway     0.30
Rob         0.27
burg        0.27
burgl       0.26
Rob         0.25
Activations Density: 0.122%