INDEX
Explanations
keywords related to criminal activities, particularly murder
references to the word "murder."
Negative Logits
zl            -0.77
Stud          -0.71
english       -0.68
Scot          -0.68
arta          -0.66
UTC           -0.66
reusable      -0.65
une           -0.64
DIR           -0.64
Cola          -0.64
Positive Logits
murder        1.15
spree         1.08
murders       1.05
homicide      1.04
Murder        0.97
homicides     0.95
hyde          0.93
murderer      0.92
manslaughter  0.92
killings      0.90
Activations Density 0.014%