INDEX
    Explanations

    words related to racial or ethnic identity

    references to the word "ir."

    Negative Logits
    Token         Logit
    "ĸļ"          -0.76
    "İĭ"          -0.72
    " CLSID"      -0.72
    " Vinyl"      -0.69
    " Avalon"     -0.68
    "erker"       -0.66
    " Canary"     -0.66
    "plain"       -0.64
    "YC"          -0.63
    " Patreon"    -0.63
    Positive Logits
    Token         Logit
    "rha"          1.06
    "vana"         1.00
    "andom"        0.90
    "cles"         0.88
    "ROR"          0.83
    "onda"         0.82
    "ilateral"     0.82
    "ror"          0.82
    "ashi"         0.80
    "respective"   0.80
    Activation Density: 0.013%

    No Known Activations