INDEX
Explanations
phrases related to political figures and campaigns
Neuron Alignment
Index    Value    % of L₁
1842     +0.11    0.3%
609      +0.09    0.3%
1253     +0.07    0.2%
Correlated Neurons
Index    P. Corr.    Cos Sim.
392      +0.11       0.03
81       +0.09       0.02
599      +0.07       0.06
Negative Logits
unwarran       -1.32
impractica     -1.26
disagre        -1.24
reluct         -1.23
unlaw          -1.19
increa         -1.19
indestru       -1.18
encomp         -1.17
McLaugh        -1.12
practition     -1.12
Positive Logits
voters          0.57
anship          0.55
vision          0.52
candidate       0.52
ContentAsync    0.52
Bilder          0.51
integrity       0.51
promise         0.51
investi         0.51
endcsname       0.50
Activations Density 0.544%