INDEX
Explanations
mentions of different ethnic groups or conflicts related to ethnicity
references to ethnic groups and their characteristics or experiences
Negative Logits
uden               -0.89
soDeliveryDate     -0.87
ertodd             -0.84
tower              -0.84
agher              -0.83
awaru              -0.80
DERR               -0.73
inventoryQuantity  -0.73
OHN                -0.73
ilton              -0.72
Positive Logits
cleansing     1.30
minorities    1.24
ities         1.22
minority      1.05
groups        0.97
profiling     0.89
nationalists  0.89
slurs         0.89
nationalism   0.88
diversity     0.87
Activations Density 0.045%