INDEX
Explanations
strongly negative adjectives
negative assessments or critiques, particularly the word "terrible."
Negative Logits
ership      -0.94
pai         -0.94
ilus        -0.83
ovember     -0.78
illary      -0.78
cript       -0.77
arger       -0.76
adr         -0.76
ioch        -0.75
RAFT        -0.75
Positive Logits
nightmare   0.88
earthqu     0.88
awful       0.87
havoc       0.87
horrible    0.85
nightmares  0.83
terrible    0.83
sounding    0.82
headache    0.81
conflic     0.77
Activations Density 0.012%