INDEX
Explanations
punctuation marks and phrases indicating time or dates
Follows punctuation like commas and periods
commas followed by continuations
Negative Logits
myſelf           -1.07
Majefty          -1.03
itſelf           -1.02
$_"              -1.01
Jefus            -1.01
Efq              -0.99
betweenstory     -0.99
становника       -0.97
Personensuche    -0.97
saraba           -0.97
Positive Logits
I                0.85
<eos>            0.76
I                0.73
we               0.70
↵↵               0.69
.                0.66
↵                0.64
it               0.61
(                0.61
The              0.61
Activations Density: 0.689%