Explanations
instances of the pronoun 'I'
Negative Logits
Theſe        -1.22
Beſ          -1.10
―――――        -0.96
ſeveral      -0.88
themſelves   -0.85
Anſ          -0.84
Monfieur     -0.84
Cuen         -0.81
whoſe        -0.81
abond        -0.80
Positive Logits
I            2.01
I            1.70
i            1.26
iI           0.98
pI           0.97
я            0.95
O            0.94
𝗜            0.93
आई           0.91
J            0.91
Activations Density 0.481%
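
An activation density figure like the one above is commonly read as the share of sampled tokens on which the feature fires. The sketch below assumes that reading. The function name `activation_density`, the zero threshold, and the sample values are hypothetical and are not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: a token counts as "active" when its activation exceeds
    the threshold (0.0 here). The dashboard may use a different rule.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the feature fires on 1 of 8 tokens.
sample = [0.0, 0.0, 3.2, 0.0, 0.0, 0.0, 0.0, 0.0]
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this definition, a density of 0.481% would mean the feature is active on roughly 1 in every 208 sampled tokens.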