Explanations
references to male characters or figures
men or husbands
male and female people
Negative Logits
Token         Logit
nakalista     -0.99
^(@)          -0.96
expandindo    -0.96
EconPapers    -0.93
محفوظة        -0.93
snippetHide   -0.92
']")          -0.87
})));         -0.87
Escolar       -0.85
Ramadhan      -0.85
Positive Logits
Token         Logit
men           0.87
Men           0.79
who           0.78
Man           0.68
MEN           0.67
man           0.67
Men           0.62
men           0.62
Woman         0.54
Man           0.53
Activations Density: 0.087%