INDEX
Explanations
phrases related to violent events, specifically shootings
references to violent incidents, specifically shootings
Negative Logits
GY      -0.85
undai   -0.79
ateg    -0.77
annis   -0.74
Label   -0.73
hw      -0.73
eday    -0.73
adian   -0.72
onge    -0.72
uchin   -0.71
Positive Logits
spree      1.33
rampage    1.12
deaths     0.91
shootings  0.86
nikov      0.84
powder     0.80
shooting   0.80
scenes     0.79
massacre   0.79
victims    0.78
Activations Density: 0.026%