Explanations
possessive pronouns and their objects
Negative Logits
ourselves  0.57
yourselves  0.56
자신  0.52
myself  0.51
ตัวเอง  0.51
oneself  0.47
自分  0.46
себе  0.46
itself  0.46
Himself  0.45
Positive Logits
hobbies  0.49
அவர்களுடைய  0.47
댁  0.46
opinion  0.46
gardener  0.45
favorite  0.45
favourite  0.44
ريخ  0.44
belongings  0.44
prized  0.44
Activations Density: 0.020%