INDEX
Explanations
phrases related to violent actions, especially targeted at prominent figures
instances of the word "assassination" and related terms
Negative Logits
Token        Logit
bos          -0.85
orders       -0.73
rl           -0.71
hist         -0.70
ulia         -0.69
Commerce     -0.68
fm           -0.67
Toys         -0.66
È            -0.66
Seah         -0.65
Positive Logits
Token          Logit
assassination  1.00
assassinated   0.98
assass         0.98
assassins      0.97
assassin       0.93
assassinate    0.90
Assass         0.87
slit           0.86
plots          0.74
satir          0.73
Activations Density 0.078%
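
To put the density figure in perspective, the short sketch below converts it into an approximate firing rate. It assumes that "Activations Density" means the fraction of sampled tokens on which the feature has a nonzero activation. The source does not define the term, so treat that reading, and the variable names, as illustrative only.

```python
# Assumption: "Activations Density" is the fraction of sampled tokens
# on which the feature activates (nonzero activation).
density_percent = 0.078  # value reported in the dashboard

# Convert the percentage into a plain fraction.
density = density_percent / 100.0

# Average number of tokens between activations under that assumption.
tokens_per_activation = 1.0 / density

# Expected number of activating tokens in a one-million-token sample.
expected_per_million = density * 1_000_000

print(f"fraction of tokens: {density:.5f}")
print(f"about one activation per {tokens_per_activation:.0f} tokens")
print(f"about {expected_per_million:.0f} activations per million tokens")
```

Under this reading the feature is sparse, which fits a feature whose top positive logits are almost all variants of a single word stem.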