    Explanations

    words related to threatening behavior

    phrases related to threats, particularly those of violence or intimidation

    Negative Logits
    çĦ              -0.84
    arist           -0.80
    urgy            -0.73
    coat            -0.71
    puted           -0.71
    mys             -0.70
     Balanced       -0.70
    bits            -0.69
    cise            -0.69
    éĸ              -0.68
    Positive Logits
     threats         0.85
     posed           0.81
     warnings        0.80
     threatening     0.78
     posters         0.75
     threat          0.74
     intimidation    0.74
     hotline         0.74
     threatened      0.73
     leveled         0.72
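
The two logit lists above read as the tokens with the largest and smallest values for this feature. A minimal sketch of that selection step, assuming the per-token values are already available as a mapping (the function name `top_and_bottom` and the dictionary layout are hypothetical, and the page does not say how the values themselves were computed):

```python
import heapq

def top_and_bottom(logit_effects, k=10):
    """Return the k highest and k lowest (token, value) pairs.

    logit_effects: dict mapping a token string to its value.
    This layout is an assumption made for illustration only.
    """
    items = logit_effects.items()
    top = heapq.nlargest(k, items, key=lambda kv: kv[1])
    bottom = heapq.nsmallest(k, items, key=lambda kv: kv[1])
    return top, bottom

# A few values copied from the lists above; leading spaces are part of the token.
effects = {
    " threats": 0.85,
    " posed": 0.81,
    " warnings": 0.80,
    "arist": -0.80,
    "urgy": -0.73,
    "coat": -0.71,
}

top, bottom = top_and_bottom(effects, k=2)
for token, value in top + bottom:
    print(f"{token!r:14} {value:+.2f}")
```

Printing with `!r` keeps the leading space of tokens such as `' threats'` visible, which is easy to lose in a plain column layout.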
    Act Density 0.023%

    No Known Activations
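
For reading the "Act Density" figure: a minimal sketch, assuming it denotes the fraction of token positions on which the feature's activation is nonzero. That definition, the function name `activation_density`, and the synthetic data are all assumptions; the page itself does not define the metric.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumed definition of activation density; not confirmed by the page.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)

# Synthetic example: 100,000 token positions, 23 of which fire.
sample = [0.0] * 100_000
for i in range(23):
    sample[i * 4000] = 1.5  # arbitrary positive activation value

density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.023% corresponds to roughly 23 active positions per 100,000 tokens.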