INDEX
    Explanations

    References to locations, specifically mentions of "New York City"

    Negative Logits
    "rar"       -0.80
    "terness"   -0.78
    "hement"    -0.75
    "riet"      -0.75
    "icum"      -0.73
    "iru"       -0.72
    "nir"       -0.71
    " scrut"    -0.71
    "igham"     -0.70
    "phabet"    -0.70
    Positive Logits
    " subway"          1.09
    " borough"         1.00
    " FC"              0.97
    " skyline"         0.97
    "scape"            0.95
    " landmarks"       0.93
    " neighborhoods"   0.92
    " Mayor"           0.92
    " skysc"           0.91
    " streets"         0.91
    Act Density 0.049%

    No Known Activations
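
As a rough illustration of how an activation density figure like the 0.049% above could be read, the sketch below assumes density means the fraction of token positions where the feature's activation exceeds zero. That definition, the function name `activation_density`, and the sample data are assumptions for illustration, not something this page states.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumed definition; the dashboard does not say how it computes density.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: the feature fires on 49 of 100,000 token positions.
acts = [1.0] * 49 + [0.0] * (100_000 - 49)
density = activation_density(acts)
print(f"Act Density {density * 100:.3f}%")
```

Under this reading, a density of 0.049% corresponds to roughly one firing token position in every 2,000.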