INDEX
    Explanations

    references to gender, specifically men, in relation to societal norms and stereotypes

    references to men and their comparative roles or attributes in society

    Negative Logits
    "ITS"         -0.81
    "Assembly"    -0.80
    "UGE"         -0.78
    "REC"         -0.75
    "UFF"         -0.75
    "Ward"        -0.73
    "IVERS"       -0.70
    "OWN"         -0.70
    "REDACTED"    -0.67
    "Burn"        -0.67
    Positive Logits
    "volent"          1.08
    "opausal"         1.03
    " ejac"           0.94
    " genitals"       0.93
    "folk"            0.86
    "ager"            0.79
    "icide"           0.79
    " friendships"    0.77
    " sexually"       0.72
    " pronouns"       0.72
    Activation Density: 0.110%

    No Known Activations