INDEX
    Explanations

    references to gender, particularly focusing on male identities and issues related to masculinity

    Negative Logits
    Token           Logit
    " guys"         -0.23
    " Guys"         -0.21
    " boys"         -0.19
    " men"          -0.19
    "ners"          -0.18
    " Boys"         -0.17
    "edly"          -0.15
    "rei"           -0.15
    " guy"          -0.15
    " hombres"      -0.15
    Positive Logits
    Token           Logit
    "volent"        0.45
    "-dominated"    0.36
    "fic"           0.33
    "/f"            0.28
    "factor"        0.28
    "-bodied"       0.28
    "-only"         0.28
    "-centric"      0.26
    "-led"          0.26
    "vol"           0.25
    Activation Density: 0.021%

    No Known Activations