INDEX
Explanations
first-person singular pronouns ('I')
instances of the pronoun "I" and related self-referential expressions
Negative Logits
pires       -0.65
Materials   -0.59
Georgian    -0.57
Gale        -0.57
optics      -0.57
Uriel       -0.57
eers        -0.57
ieves       -0.56
Leban       -0.55
us          -0.53
Positive Logits
'm          1.67
am          1.27
verson      0.99
aido        0.96
ggy         0.92
xtap        0.89
ANA         0.89
've         0.89
myself      0.87
Am          0.86
Activations Density 0.189%