INDEX
Explanations
specific formatting or code syntax
Neuron Alignment
Index   Value   % of L₁
444     +0.14   0.8%
71      +0.13   0.7%
457     +0.12   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
444     +0.14      0.02
404     +0.13      0.01
111     +0.12      0.01
Negative Logits
acular    -1.91
naments   -1.81
akers     -1.69
urope     -1.68
rica      -1.66
igated    -1.58
argin     -1.56
tering    -1.55
chers     -1.52
pering    -1.49
Positive Logits
£   2.06
¥   1.95
Ļ   1.93
Ĵ   1.89
§   1.82
·   1.69
ĺ   1.69
į   1.67
¤   1.67
Ĩ   1.66
Activations Density 0.013%