INDEX
Explanations
possessive pronouns and their associated references
Negative Logits
ernote           -0.17
Wunused          -0.16
xdd              -0.15
_mD              -0.15
вÑģп             -0.14
udu              -0.14
grandchildren    -0.14
ellas            -0.14
(éĩij            -0.14
Jaune            -0.14
Positive Logits
former            0.23
                  0.20
fellow            0.19
friend            0.18
aho               0.18
co                0.17
another           0.17
colleague         0.16
previous          0.16
ada               0.16
Activations Density 0.065%
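
The density figure above can be read as the share of sampled tokens on which this feature fires. The page does not state how it is computed, so the following is only a minimal sketch under that assumption; the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and chosen purely for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature is active. The source page does not define the term.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 13 of 20,000 tokens.
sample = [1.5] * 13 + [0.0] * 19_987
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.065% would mean the feature is active on roughly 6 or 7 tokens out of every 10,000 sampled.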