    Explanations

    terms related to ethnicity and race

    Negative Logits
    iero
    -0.18
    른
    -0.17
    aries
    -0.15
    ματα
    -0.15
     Chop
    -0.15
    fak
    -0.14
    yun
    -0.14
    ylon
    -0.14
    zen
    -0.14
    illance
    -0.14
    Positive Logits
     minority
    0.21
     minorities
    0.21
     Minority
    0.19
    -specific
    0.19
     cleansing
    0.18
    /r
    0.18
    ities
    0.17
    oot
    0.17
     Cleans
    0.16
    ональ
    0.16
    Activation Density: 0.019%

    No Known Activations