Explanations
percentages and statistics related to public opinion and political preferences
Neuron Alignment
Index   Value   % of L₁
1535    +0.24   0.8%
382     +0.19   0.6%
2034    +0.18   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
382     +0.24      0.08
1535    +0.19      0.07
310     +0.18      0.05
Negative Logits
Token       Logit
sappi       -1.14
viciss      -1.08
Leurs       -1.03
churrasco   -0.99
embodi      -0.99
uncin       -0.97
gsx         -0.96
Jusqu       -0.96
Darum       -0.93
piña        -0.93
Positive Logits
Token       Logit
↵↵          0.83
Even        0.80
However     0.76
This        0.74
So          0.73
Thus        0.73
Therefore   0.73
Also        0.71
<eos>       0.69
In          0.68
Activations Density 0.292%