Explanations
structured data and summaries in code snippets
Neuron Alignment
Index    Value    % of L₁
461      +0.14    0.8%
287      +0.13    0.8%
198      +0.13    0.8%
Correlated Neurons
Index    P. Corr.    Cos Sim.
131      +0.14       0.05
21       +0.13       0.04
461      +0.13       0.05
Negative Logits
dala          -1.50
inib          -1.48
soda          -1.42
interested    -1.41
nai           -1.40
)](           -1.36
sit           -1.32
egg           -1.31
erty          -1.30
.’”           -1.30
Positive Logits
bound              1.63
reflections        1.53
COPYRIGHT          1.43
Apply              1.42
Dictionary         1.40
Register           1.37
Invalid            1.34
hereby             1.33
Redistributions    1.32
exceptions         1.29
Activations Density: 0.558%