INDEX
Explanations
repeated use of the first-person pronoun "I."
Negative Logits
Theſe       -1.17
Beſ         -1.06
Efq         -1.05
Reſ         -1.03
Monfieur    -1.00
Eſ          -0.98
Anſ         -0.96
Houſe       -0.96
―――――       -0.95
ſeveral     -0.93
Positive Logits
I     2.90
I     2.13
we    1.43
i     1.38
We    1.38
my    1.32
я     1.28
My    1.28
我    1.26
We    1.21
Activations Density 0.243%