INDEX
Explanations
references to female characters and their relationships in narratives
Negative Logits
his        -1.81
himself    -1.74
his        -1.49
himself    -1.39
彼は       -1.29
he         -1.25
彼の       -1.22
seinem     -1.19
seinen     -1.17
seine      -1.14
Positive Logits
herself       2.18
herself       1.65
her           1.20
she           1.14
حياتها        1.08
ihrer         1.07
ihrem         1.07
acompañada    1.02
hennes        1.01
نفسها         1.01
Activations Density: 0.187%