Explanations
possessive pronouns and personal pronouns indicating possession
possessive pronouns indicating personal relationships or familial connections
Negative Logits
女             -0.81
ablishment    -0.79
among         -0.78
daq           -0.76
redits        -0.76
aneers        -0.76
ancial        -0.75
Ô             -0.74
ittees        -0.74
eware         -0.72
Positive Logits
dad            1.58
mom            1.52
father         1.51
mother         1.49
parents        1.47
grandmother    1.46
classmates     1.45
roommate       1.42
aunt           1.41
classmate      1.34
Activations Density 0.294%