Explanations
references to women in various contexts
Negative Logits
ypes                 -0.95
raltar               -0.84
kefeller             -0.80
emetery              -0.78
quickShipAvailable   -0.77
Flavoring            -0.77
ernels               -0.77
agascar              -0.75
incinn               -0.75
UFF                  -0.74
Positive Logits
izer                 1.21
hood                 1.11
herself              1.09
pher                 0.98
vagina               0.93
pregnant             0.91
breastfeeding        0.91
folk                 0.90
cule                 0.90
Louise               0.89
Activations Density 0.039%