INDEX
Explanations
keywords related to gender and political references
references to women and discussions surrounding women's issues
Negative Logits
compatible   -0.63
dotted       -0.61
scoring      -0.57
spect        -0.57
sensors      -0.57
enc          -0.57
needles      -0.57
options      -0.56
score        -0.56
orbital      -0.55
Positive Logits
woman        4.32
women        2.79
Woman        2.10
men          2.05
person       1.96
girl         1.90
man          1.84
Woman        1.75
woman        1.63
mans         1.59
Activations Density: 0.008%