INDEX
    Explanations

    phrases related to physical harm, especially stabbing

    references to stabbing incidents or related injuries

    Negative Logits
     Cosmos         -0.75
    oS              -0.72
    mberg           -0.70
     Geographic     -0.69
    XM              -0.68
     Afric          -0.64
     Organization   -0.64
    AMA             -0.63
     Leban          -0.63
    VO              -0.63
    Positive Logits
     wounds         1.05
    lished          0.98
     slit           0.89
     throats        0.84
    nery            0.82
     rampage        0.82
     stabbing       0.81
     stab           0.81
    nesday          0.80
     wrists         0.80
    Activation Density 0.029%

    No Known Activations