INDEX
Explanations
pronouns and possessive pronouns related to a female subject
references to a female subject
Negative Logits
vernment       -0.69
undo           -0.63
escription     -0.63
hovah          -0.63
anamo          -0.60
homosexuals    -0.59
Reply          -0.59
ornia          -0.59
govtrack       -0.58
emetery        -0.58
Positive Logits
pher     1.41
athed    1.18
ding     1.02
lled     1.02
athing   1.01
pard     0.99
ikh      0.96
cule     0.96
pherd    0.88
ldon     0.88
Activations Density: 0.180%