    Explanations

    references to gender or gender-related terms, particularly focusing on the term "male"

    mentions of male and female categories

    Negative Logits
    rieg      -0.79
    bley      -0.79
    leans     -0.78
    heet      -0.77
    lay       -0.75
    ovie      -0.75
    hemy      -0.74
    ingen     -0.73
    weet      -0.73
    Deal      -0.72
    Positive Logits
    volent          1.75
     genital        1.28
    vol             0.99
     anatomy        0.91
     genitals       0.89
     infertility    0.87
     counterparts   0.87
     circumcision   0.86
     reproductive   0.85
     gaze           0.85
    Activation Density: 0.068%

    No Known Activations