    Explanations

    statements or phrases related to racism, specifically when the term "racist" is mentioned or implied

    references to racism and racist behavior

    Negative Logits
    Token         Logit
    "ITNESS"      -0.85
    "icular"      -0.82
    "pad"         -0.81
    "amina"       -0.79
    "Delivery"    -0.72
    "eenth"       -0.70
    "ATURE"       -0.70
    "marks"       -0.70
    "ieth"        -0.70
    "imen"        -0.70
    Positive Logits
    Token             Logit
    " slurs"          1.18
    " prejudice"      0.94
    " stereotypes"    0.93
    " stereotyp"      0.91
    " slur"           0.91
    " stereotype"     0.89
    " tir"            0.86
    " racist"         0.80
    " racists"        0.80
    " bigot"          0.80
    Activation Density: 0.050%

    No Known Activations