Explanations
references to different nationalities and ethnic groups
Negative Logits
kasarigan    -0.69
baa          -0.53
ciel         -0.51
въ           -0.50
stet         -0.50
atra         -0.50
Adi          -0.48
crop         -0.48
Aziz         -0.47
벌           -0.47
Positive Logits
Canadians    1.25
Americans    1.24
Americans    1.20
Nigerians    1.13
Australians  1.09
Britons      1.07
Israelis     1.06
Kenyans      1.04
Ghanaians    1.04
Koreans      1.03
Activations Density 0.278%