INDEX
Explanations
phrases related to violent events, specifically shootings
references to incidents of gun violence or shootings
Negative Logits
hw        -0.93
GY        -0.83
Label     -0.80
undai     -0.79
hern      -0.77
ateg      -0.77
redo      -0.76
ebook     -0.74
atom      -0.73
Cola      -0.71
Positive Logits
spree     1.31
rampage   1.02
shooting  0.87
shootings 0.82
scenes    0.81
shoot     0.80
powder    0.79
Shoot     0.79
nikov     0.77
scene     0.76
Activations Density 0.022%