INDEX
    Explanations

    references to different kinds of threats, particularly death threats

    instances of the word "threats" related to various forms of intimidation or danger

    Negative Logits
    çĦ       -0.83
    arist    -0.81
    UX       -0.77
    cise     -0.76
    bred     -0.73
    puted    -0.73
    æ©Ł      -0.70
    tiny     -0.70
    NAS      -0.70
    CSS      -0.69
    Positive Logits
     threats         0.87
     threatening     0.83
     posed           0.79
     against         0.78
     retaliation     0.78
     repr            0.77
     intimidation    0.75
     threatened      0.74
     leveled         0.72
     hotline         0.72
    Act Density 0.030%

    No Known Activations