Explanations
references to women and their experiences or empowerment
"women" or "woman"
Negative Logits
[             -0.44
des           -0.44
Sanford       -0.43
Sykes         -0.42
roy           -0.41
Peralta       -0.41
Coronation    -0.41
Stark         -0.40
Smy           -0.40
DES           -0.40
Positive Logits
Woman         1.13
Woman         1.09
woman         1.09
WOMAN         1.05
Women         1.05
women         1.02
Women         1.00
WOMAN         0.98
WOMEN         0.96
women         0.95
Activations Density 0.049%
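
The density figure can be read as a rate of firing over sampled tokens. The sketch below is illustrative only: it assumes that density means the fraction of token positions on which the feature's activation is above zero, which the page itself does not state. The function name `activation_density` and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of sampled tokens on which the
    feature is active (activation > threshold).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 49 of them active.
sample = [0.0] * 99_951 + [1.5] * 49
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.049%
```

Under that assumption, a density of 0.049% corresponds to roughly 49 active positions per 100,000 tokens, which fits a feature that fires almost only on forms of "woman" and "women".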