INDEX
Explanations
words related to violent incidents or crimes
references to violent incidents or assaults
Negative Logits
ETA                      -0.71
BuyableInstoreAndOnline  -0.67
dit                      -0.66
FORMATION                -0.66
theless                  -0.64
}}}                      -0.63
UTION                    -0.62
mineral                  -0.62
ERE                      -0.61
SOURCE                   -0.61
Positive Logits
spree        1.05
perpetrated  0.94
attack       0.87
attack       0.87
attacks      0.86
against      0.84
abad         0.82
ivist        0.78
iveness      0.77
assault      0.76
Activations Density 0.039%