Explanations
positive evaluations and comparisons
Neuron Alignment
Index   Value   % of L₁
1257    +0.09   0.2%
62      +0.08   0.2%
12      +0.07   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
725     +0.09      0.04
1358    +0.08      0.07
62      +0.07      0.05
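The two columns above read as standard similarity measures. As a minimal sketch, assuming "P. Corr." is the Pearson correlation between two activation series and "Cos Sim." is the cosine similarity between two weight vectors (the page does not say which vectors it compares), they could be computed as follows. The function names are hypothetical and are not taken from the dashboard.

```python
import math


def pearson_corr(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances, without the 1/n factor (it cancels out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def cosine_sim(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

The two measures differ only in mean-centering: Pearson correlation is the cosine similarity of the mean-subtracted vectors, which is one reason the two columns need not agree.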
Negative Logits
⇐         -0.63
manuten   -0.61
hunde     -0.61
praktik   -0.61
keramik   -0.60
prega     -0.60
Bakter    -0.60
adal      -0.59
akus      -0.59
alkoh     -0.58
Positive Logits
but       1.51
but       1.29
nhưng     1.24
But       1.20
But       1.18
BUT       1.18
pero      1.10
แต่       1.01
BUT       1.00
lakini    1.00
Activations Density 1.056%
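
As a minimal sketch, assuming the density figure is the fraction of sampled tokens on which the feature's activation is above zero (the page does not define it), the number could be computed as below. The function name and the `threshold` parameter are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of activations strictly greater than `threshold`."""
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)
```

Under that assumed definition, a density of 1.056% would mean the feature fires on roughly one token in 95.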