INDEX
    Explanations

    words related to violence and crime

    references to violent behavior or incidents

    Negative Logits
    ledged      -0.85
    ĸļ          -0.83
    è£ħ         -0.80
    _>          -0.77
    nai         -0.76
    illy        -0.76
    hart        -0.74
    ffer        -0.70
    cellent     -0.69
    ilege       -0.69
    Positive Logits
     extremism      1.03
     clashes        1.01
     outburst       0.99
    acre            0.97
     crime          0.94
     confrontation  0.93
     tendencies     0.93
     jihad          0.91
     extremists     0.91
     altercation    0.89
    Activation Density 0.068%

    No Known Activations