INDEX
Explanations
names of political figures
Neuron Alignment
Index   Value   % of L₁
1177    +0.18   0.7%
381     +0.18   0.7%
1741    +0.16   0.6%
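The "% of L₁" column can be read as each neuron's share of the feature vector's total absolute weight. The sketch below shows that arithmetic under this assumption; the function name `l1_share` and the toy vector are hypothetical and are not taken from the dashboard.

```python
def l1_share(weights, index):
    """Return |weights[index]| as a fraction of the vector's L1 norm.

    Assumption: "% of L1" means one entry's absolute value divided by
    the sum of absolute values over the whole vector.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("the zero vector has no L1 norm to take a share of")
    return abs(weights[index]) / l1_norm


# Hypothetical toy vector, not data from the dashboard.
toy_weights = [0.5, -0.25, 0.25, 0.0]
print(f"{l1_share(toy_weights, 0):.1%}")
```

Under this reading, a value of +0.18 at 0.7% would imply a total L₁ norm of roughly 0.18 / 0.007 ≈ 26 for the full vector.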
Correlated Neurons
Index   P. Corr.   Cos Sim.
981     +0.18      0.09
1177    +0.18      0.04
227     +0.16      0.07
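The two columns measure different things, which is why they can disagree for the same neuron. The sketch below assumes "P. Corr." is the Pearson correlation and "Cos Sim." is the cosine similarity between two vectors; the helper names and the toy inputs are hypothetical. Pearson correlation is the cosine similarity of the mean-centered vectors, so it ignores a constant offset that cosine similarity does not.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity after mean-centering."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Hypothetical toy vectors: b is a shifted by a constant offset.
a = [1.0, 2.0, 3.0]
b = [3.0, 4.0, 5.0]
print(pearson_correlation(a, b))  # perfectly correlated
print(cosine_similarity(a, b))    # slightly below 1 because of the offset
```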
Negative Logits
Token   Logit
        -0.74
or      -0.72
.       -0.70
but     -0.70
↵↵      -0.68
in      -0.68
to      -0.68
all     -0.67
↵       -0.66
so      -0.66
Positive Logits
Token       Logit
alkoh       1.59
kram        1.55
Præ         1.53
minimalis   1.52
kosme       1.51
karton      1.50
kompati     1.48
keramik     1.48
silikon     1.47
antik       1.46
Activations Density 0.420%