INDEX
    Explanations

    Words related to negative behavior or treatment

    Terms related to cruelty and inhumanity

    Negative Logits
    Token          Logit
    " TD"          -0.78
    "Met"          -0.75
    " Cit"         -0.75
    " Amar"        -0.74
    " MET"         -0.72
    "ERT"          -0.70
    " Kislyak"     -0.70
    "mint"         -0.70
    " Jarrett"     -0.69
    " MP"          -0.69
    Positive Logits
    Token               Logit
    " cruel"            3.12
    " cruelty"          3.01
    " Cruel"            2.55
    "humane"            2.12
    " humane"           2.01
    " inhuman"          1.76
    " barbaric"         1.68
    "cru"               1.61
    " compassionate"    1.54
    " brutality"        1.47
    Activation Density: 0.029%

    No Known Activations