INDEX
Explanations
strongly negative or unpleasant adjectives
expressions related to negative emotions and suffering
Negative Logits
icipated    -0.94
encers      -0.87
encer       -0.78
reek        -0.76
RAFT        -0.76
prints      -0.74
lean        -0.74
itives      -0.74
MpServer    -0.73
ioch        -0.73
Positive Logits
miserable     0.98
wretched      0.81
misery        0.79
dismal        0.78
lonely        0.78
nightmare     0.78
pathetic      0.75
motel         0.71
depressing    0.69
miser         0.69
Activations Density 0.022%
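
To make the density figure concrete, here is a minimal sketch that assumes density means the percentage of token positions at which the feature's activation exceeds a threshold. The source does not define it, so that reading is an assumption. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of active positions; the dashboard's
    exact definition is not given in the source.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: 10,000 token positions, 2 of which activate the feature.
sample = [0.0] * 9998 + [1.7, 0.4]
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density of 0.022% would correspond to the feature firing on roughly 2 of every 10,000 tokens.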