Explanations
references to violent acts or incidents involving firearms
Negative Logits
еÑĢÑĸв      -0.16
bjerg      -0.15
Unsafe     -0.15
ùng        -0.14
GT         -0.14
ê¹         -0.13
aes        -0.13
hovering   -0.13
umas       -0.13
hsi        -0.13
Positive Logits
motive       0.21
lon          0.19
rampage      0.19
motives      0.15
motivation   0.15
randomness   0.15
irsch        0.15
iac          0.15
targeting    0.15
method       0.15
Activations Density 0.045%