Explanations
references to personal pronouns and their usage
Negative Logits
Token     Logit
obel      -0.17
ibur      -0.15
Compat    -0.15
trak      -0.15
oned      -0.15
'=>"      -0.15
lum       -0.14
ieur      -0.14
nehmen    -0.14
æ¥        -0.14
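Token strings such as `æ¥` in the list above look like byte-level BPE renderings, where each raw byte is displayed as a stand-in printable character. The sketch below assumes the tokenizer uses the GPT-2 style byte-to-character table (the page does not say which tokenizer is in use) and reverses that table to recover the underlying text. The helper names `bytes_to_unicode` and `decode_token` and the second sample string are illustrative.

```python
def bytes_to_unicode():
    """Byte -> printable character table used by GPT-2 style byte-level BPE."""
    # Bytes that are already printable map to themselves.
    keep = (list(range(ord("!"), ord("~") + 1))
            + list(range(0xA1, 0xAC + 1))
            + list(range(0xAE, 0xFF + 1)))
    table = {b: chr(b) for b in keep}
    # Every remaining byte is shifted to an unused code point from 256 upward.
    shift = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + shift)
            shift += 1
    return table


CHAR_TO_BYTE = {c: b for b, c in bytes_to_unicode().items()}


def decode_token(token):
    """Map a displayed token string back to bytes, then decode as UTF-8.

    Raises KeyError if the string contains characters outside the table,
    which suggests it was not a byte-level rendering in the first place.
    """
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    # Tokens can end mid-character, so undecodable bytes become U+FFFD.
    return raw.decode("utf-8", errors="replace")


print(decode_token("èĩªå·±"))  # a complete two-character sequence
print(decode_token("æ¥"))      # only part of a multi-byte character
```

A token that holds only part of a multi-byte character, as `æ¥` appears to, cannot be turned into readable text on its own, so it is best left in its raw form.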
Positive Logits
Token         Logit
own           0.17
themselves    0.17
mình          0.16
rown          0.15
SELF          0.15
elves         0.15
ead           0.15
itself        0.15
imar          0.14
自己          0.14
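Lists like the two above are commonly produced by projecting a feature's output direction through the model's unembedding matrix and keeping the most negative and most positive entries. The page does not state how its values were computed, so the following is only a minimal sketch under that assumption; `top_logit_effects`, its arguments, and the toy numbers are hypothetical.

```python
def top_logit_effects(direction, unembed, vocab, k=10):
    """Return the k most negative and k most positive per-token logit effects.

    direction: list of d floats (assumed feature output direction)
    unembed:   d rows, each with one float per vocabulary entry
    vocab:     token strings, one per unembedding column
    """
    effects = [
        sum(direction[i] * unembed[i][j] for i in range(len(direction)))
        for j in range(len(vocab))
    ]
    ranked = sorted(zip(vocab, effects), key=lambda pair: pair[1])
    negative = ranked[:k]        # most negative first
    positive = ranked[::-1][:k]  # most positive first
    return negative, positive


# Toy data: 2-dimensional direction, 3-token vocabulary.
neg, pos = top_logit_effects(
    direction=[1.0, 0.0],
    unembed=[[0.5, -0.5, 0.1], [9.0, 9.0, 9.0]],
    vocab=["own", "obel", "the"],
    k=1,
)
print(neg, pos)
```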
Activations Density 0.112%
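One plausible reading of the density figure is the share of sampled tokens on which the feature fires at all. The sketch below assumes that definition, which the page itself does not spell out; `activation_density` and the sample values are illustrative.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds the threshold."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy sample: the feature fires on 1 of 4 token positions.
density = activation_density([0.0, 0.0, 0.5, 0.0])
print(f"{density:.3%}")
```

Under this reading, a density of 0.112% would correspond to roughly one active token in every 900 sampled.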