INDEX
Explanations
punctuation marks that indicate dialogue or quoted speech
Negative Logits
myſelf      -2.10
Monfieur    -2.00
itſelf      -1.99
pleaſure    -1.97
Anſ         -1.97
―――――       -1.92
Reſ         -1.91
Efq         -1.91
Theſe       -1.80
purpoſe     -1.80
Positive Logits
(not shown)  1.32
“            1.19
,            1.12
↵            1.09
"            1.00
(not shown)  0.98
(            0.98
<eos>        0.98
.            0.97
and          0.96
Activations Density 0.155%