INDEX
    Explanations

    mentions of physical actions or incidents, particularly those involving physical altercations or force

    Negative Logits
    "ONSORED"       -0.83
    "Reviewer"      -0.82
    ")=("           -0.79
    "theless"       -0.72
    " Dispatch"     -0.68
    "mus"           -0.68
    "ulative"       -0.66
    " simultane"    -0.66
    "DOC"           -0.66
    " Handling"     -0.64
    Positive Logits
    "omsday"        1.26
    "herty"         1.22
    "ppel"          1.21
    "gging"         1.07
    "ctr"           1.02
    "lez"           1.00
    "oms"           0.96
    "ozy"           0.95
    "ctors"         0.95
    "pez"           0.94
    Activation Density: 0.041%

    No Known Activations