INDEX
Explanations
possessive pronouns referring to "my" and "your"
Negative Logits
大全      -0.16
ipar      -0.15
itious    -0.15
dre       -0.15
spar      -0.14
others    -0.14
optera    -0.14
frey      -0.14
SV        -0.14
uality    -0.14
Positive Logits
own       0.24
Own       0.20
Own       0.17
SELF      0.16
own       0.16
self      0.16
rtle      0.16
imary     0.15
ایش       0.15
naments   0.14
Activations Density: 0.114%