INDEX
Explanations
reports of negative events or incidents related to accidents, shootings, attacks, or incidents resulting in harm or casualties
instances of incidents or events that involve violence or tragedy
Negative Logits
bern            -0.79
peak            -0.73
representations -0.69
persuasion      -0.67
endorsements    -0.65
notation        -0.65
unts            -0.64
introductory    -0.64
'/              -0.64
doms            -0.63
Positive Logits
occurred        1.53
caused          1.41
resulted        1.37
happened        1.25
sparked         1.24
transpired      1.15
rocked          1.13
prompted        1.13
erupted         1.13
worsened        1.11
Activations Density: 0.270%