INDEX
    Explanations

    phrases mentioning the demographic category "men" and related concepts such as masculinity, gender issues, and sexism

    Negative Logits
    Token         Logit
    IVERS         -0.93
    Deal          -0.74
    Closure       -0.67
    Assembly      -0.65
    Berry         -0.65
    aminer        -0.64
    REDACTED      -0.64
    Democratic    -0.64
    Ward          -0.63
    Main          -0.63
    Positive Logits
    Token         Logit
    volent        1.43
    opausal       1.34
    endez         1.19
    ager          1.08
    aced          1.08
    aces          1.05
    orah          0.97
    folk          0.94
    uscript       0.93
    ial           0.86
    Activation Density: 0.452%

    No Known Activations