INDEX
Explanations
names of individuals or organizations
Neuron Alignment
Index   Value   % of L₁
395     +0.16   0.7%
1124    +0.16   0.7%
1926    +0.13   0.5%
Correlated Neurons
Index   P. Corr.   Cos Sim.
395     +0.16      0.04
1056    +0.16      0.05
227     +0.13      0.05
Negative Logits
<bos>     -1.78
guang     -1.37
xiu       -1.35
qiao      -1.25
anyuan    -1.21
qian      -1.17
huo       -1.11
Xiu       -1.09
xun       -1.06
zheng     -1.04
Positive Logits
Strukt      0.87
alkoh       0.87
Lins        0.86
quoique     0.83
Abbé        0.79
Heeren      0.79
kosme       0.76
kompati     0.75
geograf     0.75
rcParams    0.75
Activations Density 0.244%