INDEX
    Explanations

    phrases mentioning both men and women

    references to gender, particularly focusing on the roles and presence of women in various contexts

    word2 in "word1 and word2", especially when the words are conceptually related

    Negative Logits
    Joy       -0.71
    Thunder   -0.68
    MN        -0.68
    Dur       -0.68
    Kevin     -0.67
    Tok       -0.67
    afety     -0.66
    San       -0.66
    GREEN     -0.65
    Inv       -0.65
    Positive Logits
     alike         1.57
     striped       1.00
     respectively  0.99
     combatants    0.85
     halves        0.69
     faiths        0.67
     sexes         0.66
     separated     0.66
     equally       0.66
     separately    0.65
    Activation Density: 0.131%

    No Known Activations