INDEX
    Explanations

    instances of physical restraint or aggression

    NEGATIVE LOGITS
    Token            Logit
    "wts"            -0.56
    " виправивши"    -0.56
    " Single"        -0.55
    " single"        -0.54
    "уза"            -0.54
    " kasarigan"     -0.51
    "Single"         -0.51
    ""               -0.50
    "ervazione"      -0.50
    "single"         -0.49
    POSITIVE LOGITS
    Token              Logit
    "ANTLR"            0.74
    " hugs"            0.74
    " cuddle"          0.73
    "RectangleBorder"  0.73
    " kisses"          0.72
    " wrestling"       0.72
    " wrestle"         0.72
    "restling"         0.69
    " hugging"         0.69
    " abraço"          0.68
    Act Density 0.260%

    No Known Activations