INDEX
Explanations
personal interactions with the speaker being referred to
instances of the word "me."
Negative Logits
Equality    -0.68
earable     -0.64
Compliance  -0.62
Pric        -0.61
raviolet    -0.61
tide        -0.61
profits     -0.60
Centers     -0.60
tides       -0.59
Us          -0.59
Positive Logits
zzo         1.12
adows       1.09
adow        1.02
personally  1.01
lees        0.97
imei        0.96
andering    0.91
cca         0.87
gal         0.80
zz          0.80
Activations Density 0.097%