INDEX
Explanations
words related to racial and ethnic demographics
Negative Logits
Token            Logit
Jeografia        -0.83
ohjel            -0.56
Италијани        -0.53
CreateTagHelper  -0.52
featureID        -0.51
Prisoner         -0.51
الوطنيه          -0.50
perative         -0.50
Dario            -0.50
viewDid          -0.49
Positive Logits
Token            Logit
white            1.99
white            1.74
White            1.67
White            1.60
WHITE            1.54
whites           1.43
WHITE            1.37
whiteness        1.33
白               1.30
blancs           1.24
Activations Density: 0.317%