INDEX
    Explanations

    phrases related to intense negative actions or situations

    descriptors that convey extreme negativity or violence

    Negative Logits
    sembly       -0.71
     Bundle      -0.70
    ourced       -0.70
    cession      -0.69
    ools         -0.69
    ITNESS       -0.69
    ploma        -0.69
    FU           -0.67
    OPLE         -0.66
    pty          -0.66
    Positive Logits
     retribution  0.98
     merciless    0.97
     punishments  0.94
     honesty      0.92
     retaliation  0.92
     repression   0.90
     Slaughter    0.88
     punishment   0.87
     assault      0.87
     unfor        0.87
    Activation Density 0.108%

    No Known Activations