Explanations
mentions of the city of New York
references to New York City
Negative Logits
Token        Logit
tle          -0.91
ibilities    -0.88
rar          -0.83
ーテ         -0.82
wcsstore     -0.82
elines       -0.81
etsk         -0.79
lyak         -0.78
ーク         -0.77
CHAT         -0.76
Positive Logits
Token        Logit
Council      0.96
scape        0.92
District     0.89
skyline      0.88
Opera        0.87
Orchestra    0.84
Police       0.83
Mistress     0.83
Department   0.82
landmarks    0.80
Activations Density 0.023%
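
For readers unfamiliar with the density figure, the sketch below shows one common way such a number can be computed: the fraction of sampled tokens on which the feature fires at all. This is an assumption about the metric, not something the page states; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    (above-threshold) activation. The dashboard may define it differently.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 23 of 100,000 tokens.
sample = [1.5] * 23 + [0.0] * 99_977
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.023%
```

Under this reading, a density of 0.023% means the feature is active on roughly 23 out of every 100,000 tokens, which fits a feature tied to a narrow topic.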