INDEX
Explanations
occurrences of violent actions and injuries
Negative Logits
Sword      -0.16
bombed     -0.15
ROL        -0.15
dden       -0.15
rubbing    -0.15
guns       -0.15
Guns       -0.15
Rub        -0.15
tart       -0.14
ault       -0.14
Positive Logits
wounds     0.23
perfor     0.22
misses     0.22
grazing    0.21
wound      0.21
fired      0.21
rico       0.20
bullet     0.20
fatally    0.20
aimed      0.20
Activations Density: 0.096%