INDEX
Explanations
phrases mentioning the demographic category "men" and related concepts such as masculinity, gender issues, and sexism
Negative Logits
Token         Logit
IVERS         -0.93
Deal          -0.74
Closure       -0.67
Assembly      -0.65
Berry         -0.65
aminer        -0.64
REDACTED      -0.64
Democratic    -0.64
Ward          -0.63
Main          -0.63
Positive Logits
Token         Logit
volent        1.43
opausal       1.34
endez         1.19
ager          1.08
aced          1.08
aces          1.05
orah          0.97
folk          0.94
uscript       0.93
ial           0.86
Activations Density: 0.452%