INDEX
    Explanations

    phrases related to forceful actions or interventions

    references to coercive actions and violence

    Negative Logits
    " Prediction"     -0.71
    "paragraph"       -0.70
    "ancial"          -0.69
    "purpose"         -0.69
    "daily"           -0.68
    "ership"          -0.68
    "orno"            -0.67
    "sal"             -0.67
    " Purpose"        -0.67
    "rug"             -0.64
    Positive Logits
    " forcefully"     1.13
    " forcibly"       1.07
    " dru"            0.83
    " steril"         0.82
    " kissed"         0.82
    " shoved"         0.82
    " awoken"         0.79
    " overpowered"    0.79
    " violently"      0.78
    "avage"           0.77
    Activation Density: 0.011%

    No Known Activations