INDEX
    Explanations

    references to violent incidents, specifically shootings

    references to incidents of gun violence

    Negative Logits
    Token         Logit
    "hw"          -0.82
    "Label"       -0.79
    "undai"       -0.74
    "ebook"       -0.73
    " Yok"        -0.72
    "ateg"        -0.71
    "akuya"       -0.70
    "MpServer"    -0.70
    "GY"          -0.70
    "ulla"        -0.69
    Positive Logits
    Token         Logit
    " shooting"   1.03
    " spree"      1.01
    " shoot"      0.93
    " Shooting"   0.84
    "powder"      0.82
    "nikov"       0.80
    " shoots"     0.80
    " Shoot"      0.79
    " gallery"    0.78
    " fireworks"  0.77
    Activation Density: 0.016%

    No Known Activations