INDEX
Explanations
a recurring pattern in the text, specifically the representation of the abbreviation "eb."
Negative Logits
Efq        -1.81
Monfieur   -1.70
Majefty    -1.66
pleaſure   -1.62
houſe      -1.58
myſelf     -1.55
Houſe      -1.53
itſelf     -1.48
Theſe      -1.45
Jefus      -1.45
Positive Logits
           0.94
I          0.82
↵↵         0.79
[          0.76
B          0.74
↵          0.73
in         0.73
(          0.72
<eos>      0.69
T          0.69
Activations Density 0.321%