INDEX
Explanations
words related to violent and horrifying events or acts
Negative Logits
pai       -0.88
pring     -0.74
inoa      -0.73
inion     -0.71
trl       -0.71
arten     -0.71
ailable   -0.71
starter   -0.70
utenant   -0.69
haw       -0.68
Positive Logits
horrors      1.11
ordeal       1.10
atrocities   1.09
tragedies    0.95
murders      0.94
scenes       0.94
torture      0.93
atroc        0.92
tale         0.91
carnage      0.91
Activations Density: 0.108%