Explanations
statements of confirmation or validation
Neuron Alignment
Index   Value   % of L₁
376     +0.19   1.1%
443     +0.17   1.0%
72      +0.16   0.9%
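
The "% of L₁" column can be read as each neuron's share of the L₁ norm of the feature's direction vector. The source does not define the column, so this reading is an assumption. Under it, the sketch below shows how such a table could be produced. The function name `neuron_alignment`, the `direction` argument, and the toy vector are all hypothetical and are not taken from the dashboard.

```python
import numpy as np


def neuron_alignment(direction, top_k=3):
    """Return (index, value, share_of_l1) for the top_k largest-magnitude entries.

    Assumption: "% of L1" means |value| divided by the sum of absolute
    values over the whole direction vector.
    """
    direction = np.asarray(direction, dtype=float)
    l1_norm = np.abs(direction).sum()
    # A stable sort keeps the lower index first when magnitudes tie.
    order = np.argsort(-np.abs(direction), kind="stable")[:top_k]
    return [
        (int(i), float(direction[i]), float(abs(direction[i]) / l1_norm))
        for i in order
    ]


# Toy direction vector (hypothetical values, not the feature shown here).
toy = np.zeros(8)
toy[[2, 5, 7]] = [0.5, -0.3, 0.2]

for index, value, share in neuron_alignment(toy):
    print(f"{index:<7} {value:+.2f}   {share:.1%}")
```

If this reading is right, the three rows above would imply an L₁ norm of roughly 17 for this feature's direction. That figure is inferred from the rounded values and is not reported on the page.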
Correlated Neurons
Index   P. Corr.   Cos Sim.
72      +0.19      0.02
443     +0.17      0.02
118     +0.16      0.02
Negative Logits
ainen    -1.73
":["     -1.54
wered    -1.42
![**     -1.35
ilde     -1.35
hens     -1.35
aled     -1.33
owners   -1.32
\":      -1.32
akes     -1.31
Positive Logits
ĻĤ   2.98
ļ    2.60
ľĵ   2.41
¾    2.31
Ĩ    2.19
Ļ    2.18
Īĺ   2.16
ĩ    2.09
ĸ    2.04
¤    2.04
Activations Density 0.088%