INDEX
    Explanations

    actions involving physical violence

    Negative Logits
    Token            Logit
    "lio"            -0.86
    "ublic"          -0.76
    "ablishment"     -0.75
    "iors"           -0.71
    "Angelo"         -0.71
    "UX"             -0.68
    "ylum"           -0.67
    "bh"             -0.67
    " greets"        -0.67
    "Iss"            -0.65

    Positive Logits
    Token            Logit
    " fingert"       1.18
    " scissors"      1.16
    " fists"         1.06
    " fingers"       1.05
    " finger"        1.03
    " knife"         1.03
    " pencil"        1.01
    " claws"         1.00
    " clenched"      0.98
    " fingertips"    0.97
    Activation Density: 0.154%

    No Known Activations