INDEX
Explanations
references to violence and crime incidents
violence, war, and riots
violence and unrest
Negative Logits
itattu           -0.52
eynman           -0.51
diag             -0.49
magnetization    -0.49
bootstra         -0.48
benefic          -0.47
ztés             -0.47
aguya            -0.45
ειτουργ          -0.45
roba             -0.45
Positive Logits
violence         1.44
Violence         1.21
Violence         1.16
violence         1.12
violent          1.11
incidents        1.07
violencia        1.02
riots            0.97
violência        0.97
violen           0.96
Activations Density: 0.270%