INDEX
Explanations
numerical values related to statistics or measurements
Neuron Alignment
Index    Value    % of L₁
1978     +0.08    0.2%
1385     +0.07    0.2%
1473     +0.07    0.2%
Correlated Neurons
Index    P. Corr.    Cos Sim.
1473     +0.08       0.02
1471     +0.07       0.03
1754     +0.07       0.03
Negative Logits
Token        Logit
بين          -0.58
like         -0.58
beginning    -0.55
hard         -0.55
ten          -0.55
making       -0.55
To           -0.55
(blank)      -0.54
i            -0.54
saying       -0.54
Positive Logits
Token     Logit
meis      1.62
igno      1.56
grati     1.45
saar      1.45
sappi     1.43
ordina    1.42
fatis     1.41
imposs    1.41
suspic    1.40
hina      1.37
Activations Density: 0.100%