INDEX
Explanations
references to accidents, incidents involving harm, and violent events
occurrences of violent incidents or deaths
Negative Logits
Pros      -0.72
hett      -0.71
}}}       -0.70
plom      -0.67
eworld    -0.64
ecd       -0.64
Stud      -0.63
SEC       -0.61
Intern    -0.61
apps      -0.61
Positive Logits
spree          1.08
involving      0.99
occurred       0.90
perpetrated    0.87
altercation    0.84
rampage        0.83
burglary       0.79
erupted        0.76
unrelated      0.76
explosion      0.76
Activations Density: 0.258%