INDEX
    Explanations

    references to theaters or similar entertainment venues

    references to different theater establishments and productions

    Negative Logits
    "doms"    -0.83
    "imo"     -0.79
    "luent"   -0.75
    "agher"   -0.72
    "nir"     -0.69
    "nesty"   -0.69
    "itive"   -0.68
    "yg"      -0.66
    "plets"   -0.66
    "unin"    -0.65
    Positive Logits
    "goers"         1.21
    " marqu"        1.05
    " productions"  0.97
    " theatre"      0.90
    " theater"      0.88
    " Royale"       0.87
    " trou"         0.87
    " theaters"     0.83
    " halls"        0.82
    "wright"        0.81
    Activation Density: 0.076%

    No Known Activations