Explanations
General terms and punctuation that may indicate formatting or structural elements in the text.
Negative Logits
Token         Logit
myſelf        -1.93
itſelf        -1.91
Efq           -1.81
Monfieur      -1.79
ſelf          -1.78
ſelves        -1.78
Theſe         -1.74
Jefus         -1.74
―――――         -1.73
themſelves    -1.72
Positive Logits
Token         Logit
(blank)       1.45
<eos>         1.23
↵↵            1.09
'             1.02
!             1.01
.             1.01
...           1.00
-             1.00
a             0.99
to            0.97
Activations Density 0.949%