INDEX
Explanations
pronouns referring to a female subject
references to a central female character or subject
Negative Logits
natureconservancy   -0.68
»                   -0.66
expense             -0.65
OTAL                -0.65
merce               -0.64
avail               -0.62
Appearance          -0.62
margins             -0.60
oday                -0.60
wilderness          -0.60
Positive Logits
guard               0.90
endant              0.85
endants             0.84
mented              0.74
Catal               0.72
hire                0.71
ovan                0.71
ä¸Ĭ                 0.68
shire               0.66
nard                0.65
Activations Density 0.000%