    Explanations

    general instances of violence, specifically mentioning stabbing incidents

    references to violent acts involving stabbing

    Negative Logits
    quickShipAvailable    -0.74
    VO                    -0.71
    gran                  -0.70
    aut                   -0.69
    avez                  -0.66
    uph                   -0.65
    mberg                 -0.65
     Moder                -0.64
    mie                   -0.64
    oS                    -0.64

    Positive Logits
     stabbed               1.04
     stabbing              1.00
     stab                  0.87
    nesday                 0.85
     dagger                0.84
     slit                  0.81
     spree                 0.81
    lished                 0.80
     wounds                0.80
     knife                 0.78

    Activation Density: 0.006%

    No Known Activations