INDEX
Explanations
gender-related characteristics and attributes, such as masculinity and femininity
terms that represent gender concepts, specifically masculinity and femininity
Negative Logits
oulos         -0.83
Assembly      -0.81
rocket        -0.75
Deal          -0.74
owitz         -0.72
Money         -0.68
Redemption    -0.68
RAY           -0.68
DVD           -0.67
REL           -0.66
Positive Logits
hygiene       0.96
masculine     0.90
feminine      0.88
pronouns      0.87
femin         0.82
istries       0.80
masculinity   0.79
inity         0.78
mascul        0.77
WithNo        0.76
Activations Density 0.036%