INDEX
Explanations
aggressive physical actions or altercations
violent actions and related physical confrontations
Negative Logits
horm                -0.70
erella              -0.69
Tycoon              -0.69
ahime               -0.67
vana                -0.66
inventoryQuantity   -0.66
osate               -0.66
stellar             -0.65
hower               -0.64
Maiden              -0.64
Positive Logits
abusive             0.94
protestors          0.92
bystanders          0.88
protesters          0.88
verbally            0.81
handcuffed          0.81
altercation         0.81
protester           0.79
disrespectful       0.79
assaulting          0.78
Activations Density 0.345%
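
As a rough illustration of what a density figure like the one above could mean, the sketch below computes the percentage of tokens on which a feature fires. It assumes density is the share of sampled tokens with activation above zero; the page does not state its exact definition, and the function name, threshold, and sample values are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly greater than threshold.

    Assumption: "density" means the share of tokens on which the feature
    is active. The source page does not define the term.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the feature is active on 3 of 8 tokens.
sample = [0.0, 0.0, 1.2, 0.0, 0.4, 0.0, 0.0, 2.1]
print(f"{activation_density(sample):.3f}%")
```

Under this reading, a density of 0.345% would correspond to the feature being active on roughly 3 to 4 of every 1,000 tokens.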