    Explanations

    mentions of loss or endangerment of human lives

    references to the concept of lives being at risk or lost

    Negative Logits
    Token            Logit
    "ractive"        -0.67
    "phabet"         -0.66
    "CAST"           -0.65
    "gomery"         -0.64
    "atchewan"       -0.63
    "NetMessage"     -0.63
    "orney"          -0.62
    "agger"          -0.61
    "ority"          -0.60
    " Mack"          -0.60
    Positive Logits
    Token            Logit
    "lihood"          0.97
    "chool"           0.82
    "mares"           0.80
    "guards"          0.78
    "journal"         0.77
    " Forever"        0.76
    "behind"          0.74
    " lives"          0.72
    "liness"          0.72
    "ously"           0.72
    Activation Density: 0.014%

    No Known Activations