INDEX
Explanations
mentions of physical injuries or risk of harm
references to injuries in various contexts
Negative Logits
zees            -0.84
zee             -0.83
tz              -0.78
zsche           -0.77
arten           -0.74
estone          -0.68
ogle            -0.68
soDeliveryDate  -0.67
isse            -0.67
ramid           -0.65
Positive Logits
inflicted       1.14
incurred        1.00
sustained       0.91
inflic          0.85
suffered        0.84
injuries        0.84
Survivors       0.80
letal           0.78
injury          0.76
Injury          0.73
Activations Density: 0.029%