INDEX
Explanations
email address information
Neuron Alignment
Index    Value    % of L₁
7        +0.14    0.8%
468      +0.13    0.7%
309      +0.13    0.7%
Correlated Neurons
Index    P. Corr.    Cos Sim.
198      +0.14       0.04
468      +0.13       0.02
309      +0.13       0.01
Negative Logits
¿      -3.13
ķ      -2.91
»      -2.85
ł      -2.82
Ŀ      -2.79
»¿     -2.79
ı      -2.67
¢      -2.65
Į      -2.60
ĸ´     -2.53
Positive Logits
"}](#       1.61
Pradesh     1.53
dissolve    1.49
ivation     1.45
eeper       1.43
Disorder    1.37
together    1.34
Circuit     1.34
VERTIS      1.32
Abrams      1.30
Activations Density 0.547%