INDEX
Explanations
mentions of violence in various contexts
occurrences of the word "violence" and related phrases
Negative Logits
sonian      -0.84
ocular      -0.82
arton       -0.80
dit         -0.76
ramer       -0.71
osition     -0.71
oplan       -0.71
é¾įå        -0.68
itions      -0.68
odes        -0.66
Positive Logits
perpetrated  1.12
inflicted    0.97
fighting     0.87
against      0.85
violence     0.84
quit         0.81
directed     0.78
Viol         0.77
prevention   0.77
erupted      0.77
Activations Density 0.052%
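
The density figure above is not defined on this page. A minimal sketch, assuming it means the fraction of token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the dashboard itself.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly greater than `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all. The dashboard may define it differently.
    """
    acts = list(activations)
    if not acts:
        # No tokens observed, so report zero density rather than dividing by zero.
        return 0.0
    active = sum(1 for a in acts if a > threshold)
    return active / len(acts)


# Hypothetical data: 10,000 token positions, 5 of which activate the feature.
example = [1.0] * 5 + [0.0] * 9_995
density = activation_density(example)
print(f"{density:.3%}")
```

Under this reading, a density of 0.052% would correspond to the feature firing on roughly 5 of every 10,000 tokens.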