INDEX
NEGATIVE LOGITS
itſelf          -1.16
avoient         -1.12
varandra        -1.12
verwijspagina   -1.11
Diſ             -1.11
ujednoznacz     -1.09
Efq             -1.09
Monfieur        -1.09
étoient         -1.09
Reſ             -1.08
POSITIVE LOGITS
and       0.83
of        0.81
in        0.81
for       0.77
.         0.75
,         0.66
(blank)   0.65
(         0.63
on        0.61
↵↵        0.59
Activations Density 0.079%