    Explanations

    references to specific ethnic or cultural groups and their characteristics

    Negative Logits
    Token         Logit
    " Heller"     -0.19
    "PELL"        -0.16
    "atz"         -0.15
    "ãĥ¼ãĥ³"      -0.15
    "orrow"       -0.14
    "UDGE"        -0.14
    "mailer"      -0.14
    "rá"          -0.14
    "UNK"         -0.13
    "evi"         -0.13

    Positive Logits
    Token         Logit
    "men"         0.26
    "ic"          0.24
    " Turk"       0.16
    "meni"        0.16
    "μεν"         0.16
    "emen"        0.16
    "ican"        0.15
    "oman"        0.15
    " Amen"       0.15
    "iband"       0.15

    Activation Density: 0.008%

    No Known Activations