INDEX
Explanations
proper nouns related to geography, organizations, and titles
references to specific countries and their abbreviations
Negative Logits
Token         Logit
ischer        -0.77
Horus         -0.76
epad          -0.73
..........    -0.72
hement        -0.71
cgi           -0.69
efully        -0.69
ochet         -0.68
nels          -0.67
anwhile       -0.67
Positive Logits
Token         Logit
citiz         0.97
citizen       0.95
citizens      0.93
cities        0.82
households    0.82
residents     0.81
employees     0.80
towns         0.77
sylv          0.76
twent         0.76
Activations Density: 0.237%