INDEX
Explanations
references to educational or academic classifications
Neuron Alignment
Index    Value    % of L₁
490      +0.14    0.8%
23       +0.14    0.8%
170      +0.13    0.7%
Correlated Neurons
Index    P. Corr.    Cos Sim.
125      +0.14       0.03
23       +0.14       0.01
277      +0.13       0.03
Negative Logits
usp      -1.71
ably     -1.71
cil      -1.66
imize    -1.64
ariat    -1.63
iator    -1.63
iate     -1.48
ener     -1.44
ue       -1.44
uate     -1.43
Positive Logits
"];           2.06
'];           1.66
omitempty     1.59
vasive        1.57
laws          1.45
omorphisms    1.45
']);          1.43
/~            1.37
laws          1.36
ourselves     1.35
Activations Density 0.225%