INDEX
Explanations
instances of the word "become"
Neuron Alignment
Index   Value   % of L₁
156     +0.19   1.1%
376     +0.17   1.0%
304     +0.15   0.9%
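The "% of L₁" column is not defined on the page. One reading that fits the numbers shown (for example, +0.19 at 1.1% implies an L₁ norm of roughly 17) is that each entry is the neuron's absolute weight divided by the L₁ norm of the whole weight vector. The sketch below illustrates that reading only. The function names `l1_share` and `top_aligned`, the ranking by absolute value, and the toy weights are all hypothetical, not taken from the dashboard.

```python
def l1_share(weights, index):
    """Return |weights[index]| as a percentage of the vector's L1 norm.

    Assumption: this is what the "% of L1" column reports.
    """
    l1_norm = sum(abs(w) for w in weights)
    if l1_norm == 0:
        raise ValueError("weight vector has zero L1 norm")
    return 100.0 * abs(weights[index]) / l1_norm


def top_aligned(weights, k=3):
    """Return (index, value, percent_of_l1) for the k largest-magnitude weights.

    Assumption: rows are ranked by absolute value, largest first.
    """
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    return [(i, weights[i], l1_share(weights, i)) for i in ranked[:k]]


# Toy weight vector (made up for illustration; not the feature's real weights).
toy_weights = [0.5, -0.3, 0.2]
for index, value, share in top_aligned(toy_weights, k=2):
    print(f"{index:>5}  {value:+.2f}  {share:.1f}%")
```

With a real feature, `weights` would be the feature's weight vector over the neurons of the layer it reads from or writes to; which of the two the dashboard uses is not stated here.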
Correlated Neurons
Index   P. Corr.   Cos Sim.
304     +0.19      0.05
193     +0.17      0.04
240     +0.15      0.03
Negative Logits
lor       -1.62
inho      -1.54
ierno     -1.50
ail       -1.49
isesti    -1.49
ounded    -1.49
----------------------------------------------------------------------------------------------------------------  -1.46
ERY       -1.44
seys      -1.42
-----------------  -1.42
Positive Logits
¼     2.76
¶     2.50
°     2.27
«     2.25
IJ    2.19
²     2.17
¨     2.09
Ĩ     2.07
µ     2.07
Ĭ     2.06
Activations Density 3.096%