INDEX
    Explanations

    terms related to racial identity and discrimination

    topics related to race and ethnicity

    Negative Logits
    "erva"        -0.84
    "unction"     -0.78
    "UNE"         -0.77
    "cit"         -0.76
    "irs"         -0.75
    "uden"        -0.74
    "ushima"      -0.74
    "orage"       -0.74
    " Mub"        -0.74
    "unker"       -0.73
    Positive Logits
    "course"             1.03
    " Equality"          0.86
    "horse"              0.81
    "blind"              0.78
    " slurs"             0.78
    " prejudice"         0.77
    "bending"            0.76
    "hair"               0.76
    " Discrimination"    0.75
    " relations"         0.74
    Activation Density: 0.018%

    No Known Activations