INDEX
Explanations
context related to sports, pregnant women, or clothing
terms related to women
Negative Logits
boys         -1.40
male         -1.37
boy          -1.33
Boys         -1.27
boy          -1.25
boys         -1.25
masculina    -1.21
BOYS         -1.21
males        -1.20
male         -1.20
Positive Logits
women        1.37
Women        1.19
woman        1.18
women        1.15
Women        1.13
Woman        1.04
WOMEN        1.04
woman        1.03
WOMEN        0.96
female       0.92
Activations Density 1.631%