INDEX
Explanations
possessive pronouns indicating personal ownership
Negative Logits
ington      -0.16
ignum       -0.14
atti        -0.14
ering       -0.14
ord         -0.13
ially       -0.13
ords        -0.12
ness        -0.12
ERING       -0.12
/OR         -0.12
Positive Logits
own         0.50
/her        0.33
próp        0.29
Own         0.28
Own         0.28
own         0.28
self        0.25
SELF        0.25
zelf        0.25
respective  0.24
Activations Density: 0.959%