INDEX
    Explanations

    references to gender and women's issues

    Negative Logits
     Woman
    -0.26
     woman
    -0.25
    Woman
    -0.25
    woman
    -0.24
     женщина
    -0.22
     Womens
    -0.22
     Women
    -0.20
     mulher
    -0.19
     vrouw
    -0.18
    女人
    -0.18
    Positive Logits
     men
    0.34
    -men
    0.25
    men
    0.25
     children
    0.25
     Men
    0.23
     gentlemen
    0.23
    Men
    0.21
    children
    0.21
     Children
    0.20
     hommes
    0.20
    Act Density 0.029%

    No Known Activations