INDEX
Explanations
strongly negative adjectives or nouns associated with unfortunate events
intense negative descriptors related to harm or wrongdoing
Negative Logits
pai          -0.84
icles        -0.74
ratulations  -0.72
icle         -0.71
ership       -0.71
gat          -0.71
leans        -0.70
encers       -0.69
Lean         -0.69
ilus         -0.68
Positive Logits
atrocities   0.92
tragedies    0.89
nightmares   0.88
tragedy      0.87
injustice    0.84
ordeal       0.84
injust       0.84
horrible     0.83
awful        0.83
ally         0.83
Activations Density 0.097%