    Explanations

    mentions of the term "white"

    references to the term 'white'

    Negative Logits
    "=-=-=-=-"        -0.85
    "yrinth"          -0.82
    "HCR"             -0.81
    "SIGN"            -0.78
    "cffffcc"         -0.77
    "interstitial"    -0.75
    "ategory"         -0.74
    "itual"           -0.74
    "=-=-"            -0.73
    "Allow"           -0.73
    Positive Logits
    " supremacist"     1.24
    " supremacists"    1.08
    " nationalist"     0.98
    " suprem"          0.94
    " white"           0.89
    " supremacy"       0.87
    " elephant"        0.86
    "berry"            0.84
    " nationalists"    0.83
    "caps"             0.82
    Activation Density: 0.022%

    No Known Activations