INDEX
Explanations
terms related to the deterioration or decline of quality or performance
Neuron Alignment
Index   Value   % of L₁
156     +0.15   0.9%
376     +0.14   0.8%
100     +0.13   0.8%
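
The "% of L₁" column is not defined on this page. One plausible reading, assumed here, is that each neuron's value is reported as its absolute share of the L₁ norm of the whole feature vector. The rows shown are consistent with that reading (0.15 at 0.9% implies an L₁ norm of roughly 16 to 18), but this is an inference, not a documented definition. A minimal sketch with hypothetical helper names and a toy vector:

```python
def l1_share(values):
    """Each entry's absolute value as a percentage of the vector's L1 norm.

    Assumption: this mirrors the "% of L1" column; the page does not say so.
    """
    l1 = sum(abs(v) for v in values)
    if l1 == 0:
        # Avoid division by zero for an all-zero vector.
        return [0.0 for _ in values]
    return [100.0 * abs(v) / l1 for v in values]


def top_k(values, k=3):
    """Indices of the k largest entries by signed value, largest first."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]


# Toy vector for illustration only, not the real feature direction.
toy = [0.5, -0.25, 0.25, 0.0]
shares = l1_share(toy)   # percentage share per entry
leaders = top_k(toy, 2)  # indices of the two most aligned entries
```

Under this assumption, the small percentages in the table would mean the feature is spread across many neurons rather than concentrated in a few.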
Correlated Neurons
Index   P. Corr.   Cos Sim.
319     +0.15      0.01
200     +0.14      0.01
451     +0.13      0.01
Negative Logits
ONS        -1.58
lı         -1.58
full       -1.51
hes        -1.50
someone    -1.49
somebody   -1.48
ikh        -1.47
uka        -1.46
Licensed   -1.45
Publ       -1.44
Positive Logits
Ł    2.19
ģ    2.12
ļ    2.12
°    1.98
¤    1.96
Ģ    1.93
ĺ    1.93
ij   1.84
ĩ    1.84
¡    1.82
Activations Density: 0.035%