    Explanations

    references to violence or physical confrontation

    Negative Logits
    "солю"              -0.45
    "ConstraintMaker"   -0.44
    "AnchorStyles"      -0.33
    "+#+#"              -0.32
    " muualla"          -0.32
    "TryDecodeAsNil"    -0.32
    "ETTE"              -0.31
                        -0.31
    "EnableWeb"         -0.31
    " Lip"              -0.31
    Positive Logits
    " hammer"      0.93
    " hammers"     0.79
    "hammer"       0.73
    " mallet"      0.72
    " hammering"   0.66
    "Hammer"       0.65
    " Hammer"      0.61
    " hitting"     0.60
    " banging"     0.57
    " hammered"    0.57
    Act Density 0.288%

    No Known Activations