    Explanations

    phrases related to discrimination and bias based on various characteristics such as race, gender, sexual orientation, and nationality

    references to discrimination based on various characteristics such as race, sexual orientation, or disability

    Negative Logits
    eva       -0.82
    iland     -0.77
    jet       -0.75
    cca       -0.69
    Guard     -0.67
    MO        -0.66
    adia      -0.65
    cember    -0.64
    NAS       -0.64
    jan       -0.64
    Positive Logits
     ethnicity          1.24
     nationality        1.18
     merit              1.13
     whether            1.07
     race               1.05
     gender             1.04
     characteristics    1.03
     geography          1.02
     demographics       1.01
     similarity         1.00
    Act Density 0.245%

    No Known Activations