INDEX
    Explanations

    references to violence in various contexts

    Negative Logits
    Token                 Logit
    "fjspx"               -0.67
    "ContentAlignment"    -0.66
    " مرئيه"              -0.59
    "брь"                 -0.56
    " pocz"               -0.54
    "NOPQRST"             -0.52
    " Heu"                -0.52
    " wą"                 -0.52
    " Revival"            -0.52
    " tre"                -0.51

    Positive Logits
    Token                 Logit
    " violence"           2.45
    " Violence"           2.22
    "violence"            2.18
    "Violence"            2.13
    " violent"            2.07
    "violent"             1.94
    "Violent"             1.90
    " Violent"            1.83
    " violencia"          1.74
    " violen"             1.71

    Act Density 0.127%

    No Known Activations