INDEX
Explanations
patterns and variables within mathematical or statistical expressions
Neuron Alignment
Index    Value    % of L₁
497      +0.12    0.7%
47       +0.11    0.6%
275      +0.11    0.6%
Correlated Neurons
Index    P. Corr.    Cos Sim.
275      +0.12       0.02
497      +0.11       0.01
112      +0.11       0.04
Negative Logits
iquity    -1.59
igu       -1.51
oring     -1.49
olo       -1.45
*"        -1.44
ozo       -1.43
igen      -1.42
amycin    -1.41
aments    -1.40
ongo      -1.39
Positive Logits
ģ          1.72
®          1.70
ĻĤ         1.67
«          1.63
↵          1.48
↵↵         1.48
č↵č↵       1.48
(blank)    1.48
(blank)    1.48
↵          1.48
Activations Density 0.049%