INDEX
Explanations: quantitative values and measurements
Neuron Alignment
Index   Value   % of L₁
876     +0.10   0.3%
1253    +0.09   0.3%
1385    +0.09   0.2%
Correlated Neurons
Index   P. Corr.   Cos Sim.
569     +0.10      0.04
1398    +0.09      0.02
1391    +0.09      0.04
Negative Logits
marié          -0.77
unwarran       -0.69
medesimo       -0.69
pamph          -0.67
occupe         -0.65
Whigs          -0.65
mécanisme      -0.64
déclarations   -0.64
accueille      -0.64
obligé         -0.63
Positive Logits
reputa         0.70
lega           0.66
vola           0.66
lomb           0.66
pexpr          0.66
beren          0.65
SneakyThrows   0.64
bont           0.63
tota           0.63
vernac         0.62
Activations Density: 0.358%