    Explanations

    references to gender, particularly males and females

    Negative Logits
     Notion         -0.81
     Inscription    -0.69
     kasarigan      -0.68
     ?>">           -0.67
     }}"></         -0.67
     Krise          -0.66
    dill            -0.66
    شهاد            -0.65
    obatan          -0.64
     hut            -0.63

    Positive Logits
     Male           1.76
     male           1.72
     MALE           1.66
    Male            1.59
    MALE            1.59
     FEMALE         1.52
     males          1.48
     female         1.48
    male            1.47
     Female         1.45
    Activation Density: 0.100%

    No Known Activations