INDEX
Explanations
comments or documentation sections in code
Neuron Alignment
Index   Value   % of L₁
240     +0.13   0.7%
410     +0.12   0.7%
478     +0.12   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
411     +0.13      0.03
165     +0.12      0.03
463     +0.12      0.02
Negative Logits
ata     -1.57
ight    -1.55
ath     -1.54
side    -1.52
bow     -1.47
rish    -1.45
shire   -1.43
sid     -1.36
oi      -1.36
TON     -1.35
Positive Logits
¿½          1.92
½           1.59
headed      1.56
ourselves   1.53
inement     1.52
(@          1.52
:--         1.50
cases       1.48
ĥ½          1.45
formance    1.44
Activations Density 0.068%