INDEX
Explanations
references to the concept of improvement and aspects of glory
Negative Logits
myſelf       -1.99
itſelf       -1.98
Efq          -1.97
Jefus        -1.84
raiſ         -1.80
themſelves   -1.80
Majefty      -1.79
ſelves       -1.78
poffible     -1.76
ſtate        -1.75
Positive Logits
<            0.95
             0.93
↵↵           0.89
             0.88
.            0.85
<eos>        0.84
In           0.83
to           0.82
A            0.82
(            0.82
Activations Density 0.217%