Explanations
references to the concept of brevity or shortened duration
Neuron Alignment
Index    Value    % of L₁
156      +0.25    1.4%
376      +0.17    1.0%
315      +0.12    0.7%
Correlated Neurons
Index    Pearson Corr.    Cos Sim.
296      +0.25            0.01
422      +0.17            0.01
315      +0.12            0.01
Negative Logits
ĵ     -2.45
Ħ     -2.30
»     -2.27
§     -2.10
ķ     -2.05
ij     -1.98
Ľ     -1.96
Ĩ     -1.94
¨     -1.90
¿     -1.87
Positive Logits
than           2.85
Than           2.77
than           2.57
Than           2.26
dated          1.71
($\            1.70
times          1.60
wise           1.55
generations    1.52
acting         1.50
Activations Density: 0.012%