    Explanations

    words related to acts of aggression or violence

    references to attacks, particularly in a violent or aggressive context

    Negative Logits
    Token               Logit
    "zl"                -0.65
    "tz"                -0.62
    "ETA"               -0.61
    "atom"              -0.60
    " Marketable"       -0.60
    "Vert"              -0.59
    " Supplementary"    -0.58
    "Zip"               -0.58
    "theless"           -0.57
    "glomer"            -0.57

    Positive Logits
    Token               Logit
    "attack"             1.16
    " attack"            1.09
    " attacks"           0.96
    "Attack"             0.92
    " spree"             0.91
    "oise"               0.87
    " attackers"         0.86
    "attacks"            0.86
    "ocalypse"           0.83
    "CVE"                0.81

    Act Density 0.031%

    No Known Activations