    Explanations

    phrases that highlight inequalities or issues related to women's employment and physical capabilities

    Negative Logits
     myſelf       -1.15
     itſelf       -1.05
     Efq          -1.03
    berdayakan    -1.00
    ſelf          -0.99
    Portale       -0.98
     himſelf      -0.97
     Chriftian    -0.97
     uſed         -0.96
     Jefus        -0.95
    Positive Logits
     fa     0.63
     c      0.61
     con    0.60
     ha     0.60
     si     0.60
     hu     0.60
     wo     0.59
     n      0.59
     le     0.57
     w      0.57
    Act Density 0.262%

    No Known Activations