Explanations
female pronouns and words like daughter and husband that refer to women
Negative Logits
Token      Logit
her        -4.25
her        -2.48
hers       -2.25
彼女の     -2.13
herself    -2.11
她的       -2.05
그녀       -2.02
hennes     -2.00
haar       -1.91
ее         -1.89
Positive Logits
Token        Logit
betreft      0.63
maxn         0.62
socialista   0.59
pushd        0.57
rawan        0.56
elä          0.56
görünü       0.56
vettoriale   0.56
drawSprites  0.56
tiegħ        0.55
Activations Density: 4.725%
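
The density figure is usually read as the fraction of sampled token positions on which the feature fires at all. The sketch below illustrates that reading only; the function name `activation_density`, the zero threshold, and the toy activation values are assumptions made for illustration and are not taken from this page or from any particular tool.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: a feature counts as "active" on a token when its
    activation is strictly greater than the threshold (0.0 here).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: the feature fires on 3 of 8 token positions.
sample = [0.0, 0.0, 1.3, 0.0, 0.0, 0.7, 0.0, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 37.500%
```

Under this reading, a density of 4.725% would mean the feature is active on roughly 1 in 21 sampled tokens.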