INDEX
    Explanations

    actions related to violence and physical confrontations

    Negative Logits
    Token           Logit
                    -0.56
    " spesa"        -0.54
    "experiment"    -0.53
    " C"            -0.52
    "urity"         -0.52
    " Sche"         -0.51
    "Flux"          -0.51
    "alej"          -0.49
    " Flux"         -0.49
    " Bill"         -0.49
    Positive Logits
    Token           Logit
    " hitting"      1.35
    " strike"       1.31
    " hit"          1.30
    " strikes"      1.28
    " Schlag"       1.25
    " punch"        1.25
    " hammer"       1.25
    " punches"      1.24
    " blows"        1.22
    " hits"         1.21
    Activation Density: 0.266%

    No Known Activations