Negative Logits
Token         Logit
.             -1.05
to            -0.98
,             -0.90
-             -0.87
and           -0.81
↵↵            -0.77
that          -0.73
(             -0.71
to            -0.65
in            -0.65
Positive Logits
Token         Logit
myſelf        1.59
itſelf        1.58
―――――         1.53
Efq           1.47
Theſe         1.42
themſelves    1.36
Jefus         1.36
Monfieur      1.35
iſt           1.33
himſelf       1.33
Activations Density 0.162%