INDEX
    Explanations

    words related to violent actions and resulting harm or danger

    language that indicates harm, danger, or serious physical injury

    Negative Logits
     Compass          -0.78
     Sprite           -0.70
     cylinders        -0.68
    audi              -0.67
     Leaders          -0.66
     quotas           -0.66
    \":               -0.66
    soDeliveryDate    -0.64
    clus              -0.63
    ellen             -0.63
    Positive Logits
     harm              1.65
     injury            1.58
     bodily            1.38
     injuries          1.35
     anguish           1.33
     griev             1.31
     harms             1.30
     inconvenience     1.29
     damage            1.28
    damage             1.26
    Activation Density: 0.337%

    No Known Activations