Explanations
highlights related to specific nationalities
references to nationalities, particularly French, Chinese, and Japanese
Negative Logits
Token         Logit
ctic          -0.87
odder         -0.84
ueller        -0.83
lier          -0.78
affles        -0.78
vable         -0.78
inery         -0.78
utherford     -0.77
izons         -0.77
idem          -0.77
Positive Logits
Token           Logit
Nadu            0.98
proverb         0.94
oslov           0.94
nationals       0.84
cuisine         0.84
pronunciation   0.82
immigrants      0.81
apolis          0.80
Yen             0.80
immigrant       0.80
Activations Density: 0.130%