INDEX
    Explanations

    references to gender, specifically male and female

    "Male" or "female" classifications

    Negative Logits
    Token                  Logit
    "ostante"              -0.79
    "RenderAtEndOf"        -0.78
    "queryInterface"       -0.70
    " Италијани"           -0.68
    " Мексичка"            -0.68
    "webElementXpaths"     -0.67
    " hâte"                -0.66
    "seamnă"               -0.65
    " Muses"               -0.64
    " propOrder"           -0.63
    Positive Logits
    Token                  Logit
    " blu"                 0.77
    " male"                0.72
    " bl"                  0.72
    "blu"                  0.71
    " Bla"                 0.70
    " BL"                  0.69
    " Blu"                 0.69
    " bla"                 0.69
    " Bl"                  0.69
    "BLU"                  0.67
    Activation Density: 0.115%

    No Known Activations