INDEX
Explanations
instances of content related to creation or production
Neuron Alignment
Index   Value   % of L₁
150     +0.13   0.7%
376     +0.12   0.7%
15      +0.12   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
148     +0.13      0.02
15      +0.12      0.09
277     +0.12      0.07
Negative Logits
»¿      -3.44
į       -2.94
ĭ       -2.85
ĥ       -2.84
Ĩ       -2.81
ij      -2.73
Ģ       -2.71
¢       -2.59
Ļª      -2.56
Ĥ¬      -2.47
Positive Logits
leine           1.90
by              1.75
using           1.63
annealing       1.58
irectory        1.57
InstanceState   1.56
versions        1.54
ications        1.52
uate            1.51
ocument         1.46
Activations Density 1.001%