INDEX
Explanations
references to specific cities and city-related contexts
Negative Logits
odi         -0.21
egrator     -0.15
McCabe      -0.15
ilde        -0.15
775         -0.14
ỹ           -0.14
bishop      -0.14
ording      -0.14
emachine    -0.14
099         -0.13
Positive Logits
zen         0.20
zens        0.19
wide        0.18
wide        0.17
veal        0.16
slick       0.16
Hall        0.15
Wide        0.15
zung        0.14
Wide        0.14
Activations Density: 0.026%