INDEX
Explanations
names of locations and people, specifically in relation to Chinese geography and culture
Negative Logits
Token     Logit
idges     -0.18
iras      -0.15
ulet      -0.15
Slack     -0.15
Nguyen    -0.14
etler     -0.14
ois       -0.14
cky       -0.14
preter    -0.13
oppel     -0.13
Positive Logits
Token     Logit
xi        0.26
shan      0.24
'an       0.23
gang      0.22
’an       0.22
bian      0.21
cheng     0.20
County    0.19
long      0.19
men       0.19
Activations Density: 0.029%