    Explanations

    references to violence or actions involving physical confrontations

    Negative Logits
    "ModelSerializer"   -0.56
    "chero"             -0.52
    " Serializer"       -0.49
    " jari"             -0.47
    " Filtration"       -0.46
    "hdashline"         -0.46
    "gitto"             -0.46
    "tieren"            -0.45
    " tenu"             -0.45
    " maíz"             -0.45
    Positive Logits
    " hitting"    1.69
    " hits"       1.49
    " hit"        1.49
    " strikes"    1.49
    " strike"     1.49
    " struck"     1.40
    "hitting"     1.39
    " striking"   1.33
    " Hit"        1.31
    "Hit"         1.28
    Activation Density: 0.272%

    No Known Activations