INDEX
Explanations
references to firearms, gun violence, and related policies or discussions
Neuron Alignment
Index   Value   % of L₁
479     +0.15   0.6%
1296    +0.14   0.5%
1837    +0.11   0.4%
Correlated Neurons
Index   P. Corr.   Cos Sim.
479     +0.15      0.05
1296    +0.14      0.04
1837    +0.11      0.04
Negative Logits
intermitt   -0.64
stanley     -0.61
reluct      -0.61
reconno     -0.61
suscep      -0.60
fischer     -0.60
krab        -0.58
disreg      -0.58
ert         -0.58
rita        -0.58
Positive Logits
gun     1.40
Gun     1.31
guns    1.29
Gun     1.28
gun     1.23
GUN     1.16
Guns    1.15
Guns    1.10
guns    1.09
GUN     1.07
Activations Density 0.088%