INDEX
    Explanations

    references to violent incidents or attacks

    violence, attacks, and terrorism

    terrorism and violent attacks

    Negative Logits
    Personensuche   -0.38
     suspensão      -0.36
     suspensión     -0.35
     Infórmanos     -0.33
     nogal          -0.33
     delete         -0.32
    jspb            -0.32
     Scherer        -0.32
     tatuagem       -0.32
     zru            -0.32
    Positive Logits
     terrorists     0.73
     terrorist      0.73
     terrorism      0.72
    terror          0.70
     terror         0.65
    :✨              0.64
     Terrorism      0.63
     Terror         0.62
     terrorismo     0.62
     терро          0.60
    Act Density 0.126%

    No Known Activations