INDEX
Explanations
locations or proper nouns related to cities or towns
mentions of geographic locations, particularly cities
Negative Logits
aith    -0.78
Gra     -0.72
oud     -0.70
bip     -0.69
Bow     -0.69
Dob     -0.69
od      -0.68
Cy      -0.67
Kay     -0.66
ae      -0.66
Positive Logits
Shanghai      3.16
ghai          1.48
Valencia      1.35
Deng          1.28
Disneyland    1.13
Qin           1.07
yuan          1.05
Cinderella    1.05
Mandarin      1.03
Guang         0.97
Activations Density: 0.025%