    Explanations

    mentions of physical damage or harm in written text

    references to damage or harm

    Negative Logits
    Token         Logit
    "zsche"       -0.78
    "zee"         -0.71
    "rams"        -0.70
    "ramid"       -0.66
    "atorial"     -0.65
    " Bars"       -0.64
    "liner"       -0.63
    " Pitch"      -0.62
    " liner"      -0.60
    "gent"        -0.60
    Positive Logits
    Token           Logit
    " inflicted"    1.14
    " damage"       1.01
    " mitigation"   0.97
    " wrought"      0.95
    "damage"        0.87
    " havoc"        0.81
    " damaged"      0.79
    " incurred"     0.78
    " damages"      0.76
    " horm"         0.75
    Activation Density: 0.020%

    No Known Activations