INDEX
Explanations
programming-related terms and syntax errors
Neuron Alignment
Index   Value   % of L₁
115     +0.13   0.7%
71      +0.12   0.6%
288     +0.11   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
155     +0.13      0.01
159     +0.12      0.01
71      +0.11      0.01
Negative Logits
Ĭ     -2.28
ĺ     -2.17
«     -2.14
ij    -2.09
Ħ     -2.08
¼     -2.01
¿½    -2.00
į     -1.97
ī     -1.94
¢     -1.92
Positive Logits
pora      1.82
pen       1.55
antine    1.54
fee       1.54
dorff     1.51
minster   1.50
cro       1.49
inical    1.49
geries    1.49
oon       1.48
Activations Density: 0.020%