INDEX
Explanations
mentions of the color white, especially in a social or political context
references to "white" in the context of race and social issues
Negative Logits
yrinth        -1.08
cffffcc       -0.96
ategory       -0.87
HCR           -0.83
rocal         -0.80
INGTON        -0.79
itual         -0.77
udeau         -0.76
Downloadha    -0.76
SIGN          -0.76
Positive Logits
supremacist   1.39
supremacists  1.20
supremacy     1.04
nationalist   1.01
bread         1.00
elephant      0.94
suprem        0.89
beard         0.85
house         0.83
male          0.83
Activations Density 0.029%