Explanations
rankings and statistical information related to performance or evaluation
Neuron Alignment
Index   Value   % of L₁
1978    +0.12   0.4%
674     +0.12   0.4%
1385    +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1081    +0.12      0.03
994     +0.12      0.02
1655    +0.10      0.03
Negative Logits
shenan      -1.22
gaily       -1.22
maneu       -1.17
guarante    -1.16
affor       -1.15
snoopy      -1.14
indestru    -1.13
impra       -1.13
milf        -1.12
depic       -1.12
Positive Logits
top      0.87
<bos>    0.87
Top      0.83
top      0.83
Top      0.82
TOP      0.75
TOP      0.68
tops     0.61
ten      0.61
0        0.59
Activations Density 0.102%