INDEX
    Explanations

    phrases related to patriarchal values and gender roles

    Negative Logits
    Token            Logit
    " racial"        -0.17
    "racial"         -0.17
    " Stam"          -0.14
    " Ethnic"        -0.14
    "volatile"       -0.14
    " jac"           -0.14
    " homosexual"    -0.14
    " ethnic"        -0.14
    "jay"            -0.14
    "682"            -0.14
    Positive Logits
    Token            Logit
    " girls"         0.17
    " society"       0.16
    "adox"           0.15
    " Girls"         0.14
    "lev"            0.14
    "λια"            0.14
    "boys"           0.14
    " Trophy"        0.14
    " boys"          0.14
    ".Dom"           0.14
    Activation Density: 0.059%

    No Known Activations