INDEX
Explanations
terms related to architectural elements
Neuron Alignment
Index   Value   % of L₁
1472    +0.10   0.3%
1306    +0.08   0.2%
1331    +0.08   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1896    +0.10      0.02
1331    +0.08      0.03
1343    +0.08      0.03
Negative Logits
Token              Logit
<bos>              -0.68
viena              -0.62
tiks               -0.59
veik               -0.59
papild             -0.55
///**              -0.52
ConverterFactory   -0.52
didel              -0.51
įsi                -0.50
weten              -0.49
Positive Logits
Token    Logit
bal      2.18
bal      2.05
Bal      1.80
Ballot   1.71
Bal      1.69
BAL      1.67
BAL      1.65
Ballo    1.56
Balo     1.52
ballot   1.50
Activations Density: 0.239%