INDEX
Explanations
words related to the country of Japan
words related to ethnicities or national identities
Negative Logits
azine    -0.83
ihar     -0.80
ilater   -0.69
Kimber   -0.68
Cumm     -0.64
Diesel   -0.64
alist    -0.64
icals    -0.64
razil    -0.63
izons    -0.62
Positive Logits
hiro     0.84
ktop     0.83
agues    0.81
clair    0.79
ploy     0.75
uth      0.75
lect     0.75
ppe      0.74
wei      0.74
heng     0.73
Activations Density: 0.036%