Explanations
references to the concept of "mid."
Neuron Alignment
Index    Value    % of L₁
369      +0.14    0.8%
20       +0.11    0.6%
140      +0.11    0.6%
Correlated Neurons
Index    P. Corr.    Cos Sim.
20       +0.14       0.02
420      +0.11       0.02
386      +0.11       0.01
Negative Logits
ftware       -1.71
ptions       -1.64
subscribe    -1.60
Ķ            -1.51
uch          -1.50
omitempty    -1.49
complete     -1.49
ĸ            -1.45
OURCES       -1.44
ccess        -1.40
Positive Logits
azol       2.19
wick       2.10
way        2.02
range      1.88
fielder    1.87
rone       1.79
town       1.76
stad       1.76
ermost     1.73
strom      1.71
Activations Density: 0.045%