Explanations
references to political events, assassination attempts, and related themes
Negative Logits
Token      Logit
Cola       -0.80
bos        -0.77
Refresh    -0.73
Ocean      -0.72
Va         -0.70
TAG        -0.69
Forest     -0.69
Loading    -0.68
Alpha      -0.68
NES        -0.67
Positive Logits
Token            Logit
assass           1.14
assassination    1.14
assassinated     1.10
assassins        1.00
assassin         0.99
Assass           0.96
assassinate      0.93
slit             0.86
spree            0.81
attempt          0.79
Activations Density: 0.041%