Explanations
phrases or words related to violent actions, especially killing
references to killings and related violent acts
Negative Logits
Token                     Logit
Cola                      -0.72
isse                      -0.69
rypt                      -0.68
gow                       -0.68
fty                       -0.67
TAG                       -0.67
MpServer                  -0.67
Discuss                   -0.66
BuyableInstoreAndOnline   -0.66
Gi                        -0.65
Positive Logits
Token        Logit
spree        1.23
houses       0.93
killings     0.83
murdering    0.80
massac       0.79
rampage      0.78
killing      0.77
knife        0.76
killing      0.76
murders      0.76
Activations Density: 0.023%