INDEX
Explanations
instances of violent actions such as shootings, attacks, stabbings, and bombings
instances of violence or attacks
Negative Logits
ennial    -0.84
aster     -0.78
ripp      -0.73
cancer    -0.69
ゴン      -0.69
神        -0.68
onyms     -0.66
alog      -0.65
ANN       -0.64
eric      -0.64
Positive Logits
unarmed       0.91
parked        0.91
bystanders    0.86
fleeing       0.85
bystand       0.82
attempting    0.79
parach        0.79
bicy          0.79
sleeping      0.73
suspected     0.72
Activations Density: 0.434%