INDEX
Explanations
mentions of geographical locations, particularly cities
Negative Logits
che         -0.15
Hin         -0.15
Executors   -0.14
idor        -0.14
cke         -0.14
ongsTo      -0.13
Inspector   -0.13
ÑĪив        -0.13
urge        -0.13
Ngb         -0.13
Positive Logits
shire       0.19
-based      0.17
-area       0.17
gov         0.16
anism       0.15
-centric    0.14
.gov        0.14
zdy         0.14
-first      0.14
ian         0.14
Activations Density: 0.117%