Explanations
mentions of women and female-related topics
Negative Logits
Token            Logit
înc              -0.82
limus            -0.71
ValueGenerated   -0.69
Tare             -0.67
Dade             -0.66
slu              -0.65
Idy              -0.65
})_{             -0.64
idelberg         -0.64
pendium          -0.63
Positive Logits
Token            Logit
women            1.38
Women            1.31
woman            1.27
women            1.27
Women            1.25
Woman            1.21
WOMEN            1.21
WOMAN            1.18
Woman            1.16
woman            1.14
Activations Density: 0.041%