INDEX
Explanations
instances of the word "flex."
Neuron Alignment
Index   Value   % of L₁
164     +0.17   0.9%
376     +0.14   0.8%
189     +0.12   0.7%
Correlated Neurons
Index   P. Corr.   Cos Sim.
189     +0.17      0.01
164     +0.14      0.01
243     +0.12      0.01
Negative Logits
Briefly      -1.81
Spacewatch   -1.68
slightest    -1.48
soever       -1.46
beginnings   -1.44
adult        -1.41
perg         -1.41
odot         -1.40
same         -1.37
Calif        -1.37
Positive Logits
ibly         2.10
ível         1.80
istry        1.71
ors          1.61
gap          1.60
ificates     1.59
version      1.58
ENTIAL       1.55
sheet        1.54
chain        1.54
Activations Density: 0.012%