Explanations
phrases related to follow-up actions and requests
Neuron Alignment
Index   Value   % of L₁
776     +0.12   0.4%
188     +0.11   0.3%
674     +0.10   0.3%
Correlated Neurons
Index   P. Corr.   Cos Sim.
1786    +0.12      0.01
406     +0.11      0.01
370     +0.10      0.02
Negative Logits
Token      Value
tinte      -0.65
siff       -0.63
marte      -0.60
Cartes     -0.57
ordina     -0.56
litos      -0.55
Chapitre   -0.54
Ouv        -0.54
Cfr        -0.54
Garanti    -0.53
Positive Logits
Token       Value
followup    0.95
follow      0.68
impelled    0.63
vainly      0.62
unve        0.62
disambigu   0.60
follow      0.59
disagre     0.58
ujedno      0.57
glowing     0.56
Activations Density 0.099%