INDEX
    Explanations

    mentions of men and gender-related terms

    Negative Logits
    Token         Logit
    ')")          -0.95
    ']")          -0.95
    ]')           -0.86
    —             -0.86
    الإنجليزية    -0.84
     ligiloj      -0.83
     ]]           -0.82
    "");          -0.81
    $")           -0.80
    ).)           -0.78
    Positive Logits
    Token         Logit
     men          3.37
     Men          3.15
    Men           3.01
     MEN          2.79
    men           2.59
    MEN           2.31
     hommes       1.92
     Männer       1.86
     hombres      1.85
     mens         1.76
    Activation Density: 0.064%

    No Known Activations