INDEX
    Explanations

    phrases related to conflict or violence

    phrases indicating the emergence or beginning of conflicts or disturbances

    Negative Logits
    Token           Logit
    "antry"         -0.79
    " confir"       -0.73
    "opa"           -0.66
    " downgrade"    -0.62
    " miss"         -0.60
    " misses"       -0.60
    "oppy"          -0.59
    " vetoed"       -0.59
    " retracted"    -0.58
    "osuke"         -0.57

    Positive Logits
    Token           Logit
    "stretched"     0.82
    "quished"       0.75
    "olate"         0.72
    "flows"         0.71
    "casts"         0.71
    "rer"           0.70
    "flow"          0.68
    "Sax"           0.67
    "rers"          0.66
    " valves"       0.64
    Activation Density: 0.025%

    No Known Activations