INDEX
    Explanations

    references to specific locations, particularly New York City

    mentions of "New York City."

    Negative Logits
    rar           -0.91
    iru           -0.79
    ãĥł           -0.78
    VALUE         -0.76
    Wan           -0.76
    insk          -0.75
    ãĥĦ           -0.74
    20439         -0.74
    ãĥ¼ãĥĨ        -0.74
    iago          -0.73
    Positive Logits
     landmarks        0.96
     borough          0.94
     subway           0.90
     FC               0.88
     streets          0.85
     Orchestra        0.83
     skyline          0.82
     neighborhoods    0.82
     Opera            0.82
     skysc            0.82
    Activation Density: 0.097%

    No Known Activations
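
Several entries in the logit lists, such as `ãĥł` and `ãĥ¼ãĥĨ`, look like the display form that GPT-2-style byte-level BPE tokenizers use for multi-byte UTF-8 characters. The page does not say which tokenizer produced these tokens, so the following is only a minimal sketch under that assumption. The helper names `bytes_to_unicode` and `decode_token` are illustrative and not part of any tool referenced here.

```python
def bytes_to_unicode():
    """Byte -> printable-character table, assuming GPT-2-style byte-level BPE."""
    # Bytes that are already printable keep their own code point.
    keep = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    table = {b: chr(b) for b in keep}
    # Every remaining byte is shifted to an unused code point from U+0100 on.
    offset = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + offset)
            offset += 1
    return table


# Inverse table: display character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(display: str) -> str:
    """Turn a token's display form back into the text it encodes.

    Expects the tokenizer's display form, where a leading space appears as
    'Ġ'; a literal space character is not in the table and raises KeyError.
    """
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in display)
    # A single token may hold only part of a character, so do not fail hard.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĥł", "ãĥĦ", "ãĥ¼ãĥĨ"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

If the assumption holds, the three strings decode to the katakana fragments ム, ツ, and ーテ, which would make them ordinary Japanese-text tokens rather than encoding damage. Plain ASCII tokens such as `iago` pass through unchanged.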