INDEX
Explanations
formatting indicators or placeholders in code or text
Neuron Alignment
Index    Value    % of L₁
0        +0.15    0.9%
115      +0.12    0.7%
16       +0.12    0.7%
Correlated Neurons
Index    P. Corr.    Cos Sim.
0        +0.15       0.04
261      +0.12       0.03
390      +0.12       0.03
Negative Logits
©    -3.00
ľ    -2.74
Ń    -2.73
ı    -2.71
´    -2.70
º    -2.70
»    -2.70
ħ    -2.69
·    -2.67
¸    -2.66
Positive Logits
umab     1.62
endi     1.56
chitz    1.53
iliar    1.51
laws     1.50
rpm      1.49
iani     1.48
!”       1.45
ylon     1.45
ureus    1.45
Activations Density: 0.020%