INDEX
Explanations
names of cities or locations
references to the word "New" in various contexts, particularly related to location
Negative Logits
Token       Logit
uca         -0.87
ONSORED     -0.84
xual        -0.83
ILA         -0.71
alkyrie     -0.71
acebook     -0.68
ufact       -0.68
poke        -0.68
pless       -0.67
abetic      -0.67
Positive Logits
Token       Logit
foundland   1.21
bies        1.19
bie         1.18
York        1.13
Zealand     1.12
Orleans     1.02
arrivals    1.00
Hampshire   0.95
castle      0.95
YORK        0.94
Activations Density 0.054%
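
An activation density figure like the one above is usually read as the fraction of sampled token positions on which the feature fires. The sketch below illustrates that reading only. The function name `activation_density`, the zero threshold, and the sample data are hypothetical and are not taken from this dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled tokens on which the
    feature is active. The dashboard does not state its exact definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 5 of which are active.
sample = [0.0] * 9995 + [0.3, 1.2, 0.7, 2.1, 0.05]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.054% would mean the feature is active on roughly 54 of every 100,000 sampled tokens.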