Explanations
various forms of contractions and punctuation marks
Negative Logits
<eos>            -0.70
[token missing]  -0.46
.                -0.46
l                -0.45
(                -0.41
im               -0.40
↵                -0.40
den              -0.39
mod              -0.38
_                -0.38
Positive Logits
myſelf      1.56
itſelf      1.55
purpoſe     1.53
Majefty     1.52
pleaſure    1.47
poffible    1.46
raiſ        1.45
themſelves  1.43
Efq         1.42
poffe       1.41
Activations Density 0.135%