INDEX
Explanations
gender-specific mentions, with a focus on instances involving men
references to gender, specifically focusing on men
Negative Logits
Token       Logit
Assembly    -0.91
ITS         -0.81
Closure     -0.75
UFF         -0.73
Clean       -0.72
Canaver     -0.72
Manufact    -0.71
IVERS       -0.69
Mount       -0.68
Ground      -0.68
Positive Logits
Token         Logit
volent        1.13
folk          0.91
opausal       0.91
dominated     0.74
friendships   0.74
dominate      0.73
wiser         0.73
icide         0.73
hating        0.73
ejac          0.72
Activations Density 0.092%
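
The density figure above is not defined on this page. A minimal sketch of one common reading, assuming it means the fraction of token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample counts are all hypothetical, chosen only to illustrate the arithmetic:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly greater than threshold.

    Assumption: "density" means the share of token positions where the
    feature fires at all. This is not confirmed by the source page.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 92 of which fire.
sample = [0.0] * 99908 + [1.0] * 92
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.092% would correspond to the feature firing on roughly 92 of every 100,000 tokens.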