INDEX
Explanations
adjectives related to gender, specifically "masculine" and "feminine"
traits, characteristics, and behaviors associated with masculinity and femininity
references to gender norms and characteristics associated with masculinity and femininity
Negative Logits
akings     -0.82
Assembly   -0.81
Deal       -0.78
oulos      -0.78
EVA        -0.77
Rush       -0.72
EV         -0.71
Money      -0.70
BUR        -0.68
Bans       -0.67
Positive Logits
masculine     1.14
masculinity   1.11
mascul        1.00
feminine      0.98
femin         0.93
adolesc       0.80
hygiene       0.80
streng        0.77
pronouns      0.77
stereotyp     0.76
Activations Density: 0.015%