INDEX
Explanations
phrases related to violent incidents
mentions of shooting incidents
Negative Logits
Token     Logit
STEP      -0.81
hw        -0.81
Label     -0.78
ebook     -0.77
obo       -0.76
ateg      -0.76
ummies    -0.74
ulla      -0.73
GY        -0.73
undai     -0.70
Positive Logits
Token      Logit
shooting   1.21
Shooting   1.04
spree      1.02
shoot      1.01
shootings  0.96
Shoot      0.91
nikov      0.90
shoots     0.89
shooters   0.87
powder     0.86
Activations Density: 0.012%