INDEX
Explanations
terms related to different aspects of gender and transgender issues, including gender diversity, gender neutrality, gender-based discrimination, and gender identity
terms related to gender identity and transgender issues
Negative Logits
hiba        -0.83
pload       -0.78
helicop     -0.74
akings      -0.71
Cheap       -0.69
hao         -0.67
CrossRef    -0.64
shire       -0.62
uto         -0.62
Money       -0.61
Positive Logits
slurs             1.04
pronouns          0.94
genders           0.92
gender            0.91
sexuality         0.90
stereotypes       0.87
marriage          0.87
genital           0.83
discrimination    0.83
rimination        0.82
Activations Density: 0.285%
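
The density figure can be read as the share of token positions on which the feature fires at all. The page does not define it, so the sketch below assumes that reading (fraction of positions with activation above a threshold). The function name `activation_density`, the `threshold` parameter, and the toy counts are hypothetical and chosen only for illustration; they are not taken from the dashboard's data.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (made up): 20,000 token positions, 57 of which activate the feature.
sample = [0.0] * 20000
for i in range(57):
    sample[i * 350] = 1.5  # arbitrary positive activation value

density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.285% would mean the feature is active on roughly 3 of every 1,000 tokens, which is consistent with a sparse, topic-specific feature.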