INDEX
    Explanations

    words associated with violence and brutality

    Negative Logits
    Token            Logit
    "ICLE"           -0.81
    "manuel"         -0.72
    " Leilan"        -0.70
    "mberg"          -0.69
    "conservancy"    -0.69
    "bles"           -0.68
    "iasm"           -0.68
    "OPLE"           -0.68
    "ource"          -0.67
    "arters"         -0.66
    Positive Logits
    Token            Logit
    " punishments"   0.95
    " retribution"   0.93
    " beasts"        0.90
    " thug"          0.90
    " criminals"     0.88
    " punishment"    0.88
    " honesty"       0.87
    " predators"     0.85
    " murderers"     0.85
    " thugs"         0.84
    Activation Density: 0.136%

    No Known Activations