Explanations
words related to extremely negative or horrifying events
references to extreme negative events or situations
Negative Logits
Token      Logit
annis      -0.79
pai        -0.73
sers       -0.73
arten      -0.72
verning    -0.71
pta        -0.71
utenant    -0.69
sama       -0.69
cius       -0.68
glas       -0.68
Positive Logits
Token        Logit
horrors      0.86
asylum       0.84
atrocities   0.83
ally         0.82
ordeal       0.80
earthqu      0.77
omic         0.76
tales        0.75
nightmares   0.75
imagery      0.74
Activations Density: 0.053%