INDEX
Explanations
references to criminal activities and legal charges
Negative Logits
massac        -0.17
kills         -0.16
massacre      -0.16
executions    -0.16
shooters      -0.16
杀            -0.16
killers       -0.15
killings      -0.15
killed        -0.15
slaughtered   -0.15
Positive Logits
punches       0.34
physical      0.31
punched       0.30
punching      0.29
punch         0.29
Physical      0.28
Physical      0.28
physically    0.28
Punch         0.28
physical      0.27
Activations Density: 0.218%