INDEX
Explanations
words related to places or locations, specifically focusing on mentions of "New York"
references to the state of New York
Negative Logits
xual           -0.80
poke           -0.79
warts          -0.78
arching        -0.78
uddin          -0.77
actionGroup    -0.75
byss           -0.74
ï¸ı            -0.73
Reloaded       -0.73
ascript        -0.73
Positive Logits
York           1.39
Zealand        1.23
Orleans        1.19
Hampshire      1.18
Yorker         1.16
Jersey         1.15
Yorkers        1.06
Brunswick      1.00
YORK           0.97
foundland      0.93
Activations Density: 0.050%