Explanations
expressions of political commentary and actions related to governance
Neuron Alignment
Index   Value   % of L₁
369     +0.26   1.6%
410     +0.17   1.1%
419     +0.17   1.0%
Correlated Neurons
Index   P. Corr.   Cos Sim.
369     +0.26      0.08
410     +0.17      0.16
418     +0.17      0.01
Negative Logits
"}](#     -1.58
ook       -1.54
itian     -1.52
ernel     -1.49
@"        -1.34
sino      -1.32
ourier    -1.31
aussian   -1.31
ubern     -1.29
ouss      -1.27
Positive Logits
CONCLUSION   1.65
********************************   1.60
************************   1.60
****************   1.53
nex   1.51
****************************************************************   1.49
NOTE   1.47
further   1.45
↵Č   1.42
iative   1.41
Activations Density 1.547%