Explanations
references to African Americans and their experiences
Negative Logits
Token       Logit
ÙĨداÙĨ      -0.17
lasses      -0.16
mez         -0.16
jom         -0.15
rid         -0.15
oltip       -0.15
äd          -0.14
stab        -0.14
kova        -0.14
jar         -0.14
Positive Logits
Token       Logit
-American   0.33
-Americans  0.28
American    0.24
Americans   0.23
descent     0.22
American    0.21
descended   0.20
-desc       0.20
è£          0.19
american    0.19
Activations Density: 0.015%