INDEX
Explanations
phrases that emphasize the use of the first-person pronoun "I."
Negative Logits
lopen     -0.08
bilt      -0.08
áp        -0.08
itself    -0.07
bidden    -0.07
ارس       -0.07
gether    -0.07
ousand    -0.07
lx        -0.07
ENAME     -0.07
Positive Logits
’m        0.11
'm        0.11
've       0.10
’ve       0.09
myself    0.09
am        0.09
zzo       0.09
'll       0.09
буду      0.08
zelf      0.08
Activations Density: 0.115%