    Explanations

    words related to causing harm or pain

    terms associated with causing harm or injury

    Negative Logits
    Token          Logit
    " Ou"          -0.86
    "wagen"        -0.84
    "runner"       -0.79
    "cube"         -0.74
    "cius"         -0.73
    "clinton"      -0.65
    "elling"       -0.65
    "chrom"        -0.64
    " McKenna"     -0.64
    " Monaco"      -0.63
    Positive Logits
    Token          Logit
    " inflicted"    1.01
    " inflic"       0.91
    " wounds"       0.88
    " inflicting"   0.87
    " inflict"      0.85
    "hesda"         0.85
    " havoc"        0.85
    "lehem"         0.81
    " veter"        0.80
    "olon"          0.80
    Activation Density: 0.023%

    No Known Activations