    Explanations

    references to white supremacist groups such as the Ku Klux Klan (KKK) and related terms

    mentions and references to hate groups, specifically the Ku Klux Klan and related organizations

    Negative Logits
    "ochond"        -0.78
    "RW"            -0.77
    "phrine"        -0.77
    "Downloadha"    -0.77
    "neau"          -0.76
    "cially"        -0.73
    "*/("           -0.72
    "cing"          -0.71
    "lessly"        -0.69
    "ably"          -0.68
    Positive Logits
    " Klux"            1.34
    " Klan"            1.31
    " KKK"             1.06
    " affili"          0.77
    " robes"           0.77
    " supremacist"     0.76
    " supremacists"    0.74
    " affiliation"     0.74
    " NAACP"           0.73
    " imperson"        0.72
    Activation Density: 0.009%

    No Known Activations