INDEX
Explanations
mathematical symbols and constructions related to differentiation and statistical measures
Neuron Alignment
Index    Value    % of L₁
275      +0.14    0.8%
320      +0.13    0.8%
369      +0.12    0.7%
Correlated Neurons
Index    P. Corr.    Cos Sim.
275      +0.14       0.02
79       +0.13       0.02
370      +0.12       0.03
Negative Logits
/">      -1.95
)</      -1.78
()</     -1.73
/)       -1.67
/"       -1.67
"/>      -1.65
++)      -1.57
/#       -1.55
Âł↵      -1.53
ASC      -1.52
Positive Logits
·        4.02
Ĭ        3.93
ģ        3.72
ĭ        3.69
¶        3.67
İ        3.66
Īĺ       3.65
¥        3.64
ı        3.55
Ļ        3.49
Activations Density: 0.088%