INDEX
Explanations
Java function definitions and variable assignments
Neuron Alignment
Index    Value    % of L₁
478      +0.13    0.7%
93       +0.12    0.7%
294      +0.11    0.6%
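
The "% of L₁" column is not defined on this page. One plausible reading, offered only as an assumption, is that it reports each neuron's absolute value as a share of the L1 norm of the full alignment vector. A minimal Python sketch of that reading follows; the function name `l1_share` and the toy vector are hypothetical and are not taken from the dashboard.

```python
def l1_share(weights, index):
    """Return |weights[index]| as a percentage of the vector's L1 norm.

    Assumption: "% of L1" means one entry's absolute value divided by
    the sum of absolute values over the whole vector.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("L1 norm is zero; the share is undefined")
    return 100.0 * abs(weights[index]) / l1_norm


# Toy vector (hypothetical values, not the feature's real weights).
toy_weights = [0.5, -0.25, 0.25]
toy_share = l1_share(toy_weights, 0)
```

Under this reading, small percentages such as those in the table would indicate that the feature is spread across many neurons rather than concentrated in a few.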
Correlated Neurons
Index    P. Corr.    Cos Sim.
157      +0.13       0.08
22       +0.12       0.05
170      +0.11       0.06
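
The two columns above, Pearson correlation ("P. Corr.") and cosine similarity ("Cos Sim."), are related but distinct measures: Pearson correlation is the cosine similarity of the two vectors after each has been mean-centered. The page does not say which vectors are being compared, so the sketch below uses hypothetical toy lists purely to show the difference between the two formulas.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation: cosine similarity after mean-centering."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)


# Toy vectors (hypothetical): perfectly anti-correlated, yet the
# uncentered cosine similarity is still positive.
u = [1.0, 2.0, 3.0]
v = [3.0, 2.0, 1.0]
pearson_uv = pearson_correlation(u, v)
cosine_uv = cosine_similarity(u, v)
```

Because the two measures can disagree in this way, a correlation and a cosine similarity in the same row need not be close in value.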
Negative Logits
edo         -1.74
amer        -1.72
[^          -1.63
grade       -1.50
versions    -1.48
)\]         -1.47
orf         -1.46
rfloor      -1.46
fixed       -1.44
opsis       -1.43
Positive Logits
¥       1.99
Ĩ       1.93
¿       1.93
®       1.86
"#      1.85
¾       1.85
"";     1.83
"@      1.77
ī       1.77
į       1.76
Activations Density 0.592%