INDEX
    Explanations

    terms related to sexism, misogyny, and patriarchy

    references to misogyny and patriarchal themes

    Negative Logits
    "++++++++++++++++"    -0.80
    " Solitaire"          -0.79
    "Package"             -0.79
    "xxxxxxxx"            -0.79
    " Mint"               -0.76
    "HER"                 -0.71
    "EVA"                 -0.71
    " Lent"               -0.69
    "Assembly"            -0.69
    "Lago"                -0.66

    Positive Logits
    " misogyn"             1.11
    " misogyny"            0.99
    "ogyn"                 0.95
    " sexist"              0.91
    "ataka"                0.79
    "volent"               0.76
    " jokes"               0.74
    "oir"                  0.74
    " offenders"           0.74
    " stereotyp"           0.72
    Act Density 0.012%

    No Known Activations