Explanations
references to women's empowerment and rights
Negative Logits
Token      Logit
himself    -0.17
vale       -0.15
ideo       -0.15
ensis      -0.14
реб        -0.14
toolbox    -0.14
his        -0.14
Himself    -0.14
_Insert    -0.13
wiki       -0.13
Positive Logits
Token      Logit
women      0.27
Women      0.25
herself    0.22
Woman      0.21
Women      0.21
woman      0.20
ladies     0.20
women      0.19
丈夫       0.19
Womens     0.18
Activations Density: 0.521%