Explanations
words related to death or killing
instances of people being killed
Negative Logits
Token       Logit
wcsstore    -0.81
arity       -0.80
issance     -0.70
issue       -0.68
worldly     -0.67
Plex        -0.66
Collider    -0.65
rium        -0.64
yrinth      -0.64
arist       -0.61
Positive Logits
Token       Logit
spree       0.92
slain       0.75
locked      0.74
bystand     0.72
nsics       0.72
switch      0.72
Massacre    0.72
rampage     0.70
linger      0.70
bystanders  0.69
Activations Density 0.031%