Explanations
initial document boundary markers and beginning‐of‐sequence cues
instances of punctuation or quotation marks in text
Negative Logits
Shakspeare      -1.70
myſelf          -1.70
Efq             -1.64
Monfieur        -1.60
itſelf          -1.51
Theſe           -1.48
Houſe           -1.45
―――――           -1.44
Majefty         -1.43
photolibrary    -1.43
Positive Logits
the    1.23
a      1.06
in     0.97
and    0.93
an     0.92
,      0.90
on     0.89
as     0.88
it     0.86
-      0.86
Activations Density 0.672%