    Explanations

    references to African American identity and related terms

    Negative Logits
    Token        Logit
    illow        -0.09
    utsch        -0.07
    ramework     -0.07
    ëģĶ          -0.07
    tuk          -0.07
    ixa          -0.07
    hog          -0.07
    yu           -0.07
    oning        -0.07
    worthy       -0.07
    Positive Logits
    Token        Logit
    ÑĤÑĮ         0.07
    ized         0.07
    ität         0.07
    adır         0.07
    isation      0.06
    -Muslim      0.06
    /black       0.06
    ization      0.06
    ohn          0.06
    usement      0.06
    Activation Density: 0.009%

    No Known Activations