INDEX
    Explanations

    terms related to violence or aggression

    Negative Logits
    Token           Logit
    "leaf"          -0.82
    "FU"            -0.72
    "ploma"         -0.71
    "BU"            -0.70
    "cript"         -0.70
    " Unlimited"    -0.69
    "hner"          -0.69
    "ource"         -0.69
    "OPLE"          -0.68
    "Script"        -0.68
    Positive Logits
    Token             Logit
    "ized"            0.97
    " assault"        0.92
    " assaults"       0.87
    "ified"           0.86
    " retribution"    0.85
    " killers"        0.84
    " beasts"         0.83
    " punishments"    0.83
    " murdering"      0.82
    "izing"           0.82
    Act Density 0.021%
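
A figure like the one above is commonly read as the fraction of sampled token positions on which the feature fires at all. The sketch below illustrates that reading only; the function name `activation_density`, the zero threshold, and the sample data are assumptions for illustration, not something this page specifies.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "active" means strictly greater than `threshold` (0.0 here).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 21 of which are active.
sample = [0.0] * 99_979 + [0.5] * 21
density = activation_density(sample)
print(f"Act Density {density:.3%}")
```

With a density this low, a small sample of text can easily contain no firing positions at all, so the estimate is only meaningful over a large token count.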

    No Known Activations