INDEX
    Explanations

    phrases related to physical harm or violence

    terms related to bodily harm and injuries

    Negative Logits
    "gered"         -0.74
    "rieg"          -0.73
    "olini"         -0.71
    "owitz"         -0.69
    " Clover"       -0.69
    "kers"          -0.67
    "âϦ"           -0.65
    "night"         -0.65
    "effective"     -0.64
    " Kafka"        -0.63
    Positive Logits
    " fluids"       0.97
    " bodily"       0.91
    "puter"         0.85
    " injury"       0.79
    " incorpor"     0.78
    " organs"       0.77
    " tissues"      0.76
    "hesda"         0.76
    " tradem"       0.75
    " anatomy"      0.75
    Activation Density: 0.006%

    No Known Activations