Explanations
references to spans of time, especially "week" and "early"
Negative Logits
Token       Logit
myſelf      -1.34
ſelf        -1.18
".          -1.13
―――――       -1.13
*/;         -1.05
^(@)        -1.05
Efq         -1.05
itſelf      -1.03
himſelf     -1.02
Monfieur    -1.02
Positive Logits
Token       Logit
↵↵          0.99
-           0.82
;           0.74
↵           0.73
.           0.72
<eos>       0.69
!           0.63
[           0.62
;           0.60
\           0.57
Activations Density 0.861%