INDEX
Explanations
repetitive usage of the word "I."
Negative Logits
Token       Logit
Efq         -1.29
―――――       -1.26
itſelf      -1.25
faſt        -1.14
ſelves      -1.14
་་          -1.13
iſt         -1.12
Monfieur    -1.10
ſeveral     -1.09
myſelf      -1.08
Positive Logits
Token       Logit
it          1.13
I           1.09
he          1.09
we          1.08
(no visible token)  0.84
she         0.84
you         0.83
↵↵          0.79
we          0.78
it          0.77
Activations Density 0.173%