    Explanations

    instances of racism

    terms related to racism and its manifestations

    Negative Logits
    Token         Logit
    "icular"      -0.83
    "Delivery"    -0.81
    "pad"         -0.80
    "earchers"    -0.79
    "imen"        -0.77
    "irs"         -0.75
    "ITNESS"      -0.74
    "amina"       -0.73
    "Pad"         -0.72
    "Vs"          -0.71
    Positive Logits
    Token             Logit
    " slurs"          1.06
    " prejudice"      0.99
    " stereotyp"      0.84
    " hatred"         0.82
    " racists"        0.82
    " racist"         0.81
    "ethnic"          0.80
    " racism"         0.79
    " stereotypes"    0.78
    " nationalist"    0.78
    Activation Density: 0.023%

    No Known Activations