Explanations
the beginning of a document or section
Negative Logits
,     -0.72
-     -0.69
to    -0.69
al    -0.64
in    -0.63
for   -0.61
at    -0.59
(     -0.59
:     -0.59
is    -0.57
Positive Logits
myſelf       1.38
Jefus        1.21
Majefty      1.20
itſelf       1.20
Houſe        1.19
ſeveral      1.17
Reſ          1.17
ſelves       1.17
ſelf         1.16
themſelves   1.16
Activations Density 0.009%