INDEX
Explanations
mentions of specific racial and ethnic identities in a political context
Negative Logits
rikes     -0.17
lesc      -0.15
Ĩ         -0.14
oden      -0.14
udget     -0.13
.sax      -0.13
款        -0.13
azen      -0.13
oppins    -0.13
Korea     -0.13
Positive Logits
descent        0.39
decent         0.37
heritage       0.31
-desc          0.31
-des           0.31
ancestry       0.26
è£             0.25
-background    0.25
-American      0.25
-Americans     0.25
Activations Density 0.078%