INDEX
Explanations
references to previously established or current entities, agreements, or systems
Neuron Alignment
Index   Value   % of L₁
420     +0.13   0.7%
412     +0.11   0.6%
486     +0.11   0.6%
Correlated Neurons
Index   P. Corr.   Cos Sim.
201     +0.13      0.02
121     +0.11      0.02
463     +0.11      0.02
Negative Logits
Token   Logit
ĺ       -2.74
ĻĤ      -2.50
Ł       -2.47
ľ       -2.39
¡       -2.34
ļ       -2.32
Ļª      -2.27
IJ      -2.25
¥       -2.24
ı       -2.23
Positive Logits
Token       Logit
havior      1.62
ão          1.60
interface   1.55
others      1.53
poons       1.51
ambda       1.50
oles        1.41
ails        1.41
fficients   1.40
ribe        1.40
Activations Density 0.006%