INDEX
Explanations
information related to political analysis
Neuron Alignment
Index    Value    % of L₁
1535     +0.20    0.6%
382      +0.19    0.6%
2034     +0.16    0.5%
Correlated Neurons
Index    P. Corr.    Cos Sim.
382      +0.20       0.10
1535     +0.19       0.07
310      +0.16       0.04
Negative Logits
inev        -1.58
emphat      -1.54
suspic      -1.53
desir       -1.47
guarante    -1.45
reluct      -1.45
increa      -1.44
fuf         -1.43
nece        -1.43
effe        -1.42
Positive Logits
Therefore       1.01
Furthermore     0.92
Moreover        0.91
Therefore       0.85
Even            0.80
Thus            0.79
Hence           0.79
Moreover        0.76
Nevertheless    0.76
Furthermore     0.73
Activations Density: 0.463%