    Explanations

    actions and physical interactions involving violence or aggression

    Negative Logits
    Token             Logit
    "солю"            -0.37
    "EnableWeb"       -0.36
    "</tfoot>"        -0.33
    " kece"           -0.33
    " efectiva"       -0.33
    "DockStyle"       -0.32
    "白い"            -0.31
    "conditions"      -0.31
    " baik"           -0.31
    " ned"            -0.30

    Positive Logits
    Token             Logit
    " mallet"          0.55
    " hammer"          0.55
    " ProtoMessage"    0.52
    " bờ"              0.50
    "StructEnd"        0.48
    "oneofs"           0.48
    " hammers"         0.47
    " Bewußt"          0.47
    " EconPapers"      0.45
    " viață"           0.44
    Activation Density: 0.166%

    No Known Activations