    Explanations

    phrases related to gender equality and women's empowerment

    Negative Logits
    Token             Logit
    "inos"            -0.16
    "industry"        -0.15
    " vanity"         -0.14
    " Caucasian"      -0.14
    "american"        -0.14
    " interracial"    -0.14
    "OTA"             -0.13
    "/MIT"            -0.13
    " Coc"            -0.13
    "elines"          -0.13
    Positive Logits
    Token             Logit
    " violence"       0.27
    " GB"             0.27
    " Violence"       0.24
    " SR"             0.22
    "GB"              0.22
    " rights"         0.21
    "-viol"           0.21
    " Sexual"         0.20
    " sexual"         0.20
    "Viol"            0.20
    Activation Density: 0.055%

    No Known Activations