INDEX
Explanations
strong emotional reactions such as being saddened, disappointed, appalled, horrified, or shocked
expressions of strong negative emotions, particularly sadness, disappointment, and horror
Negative Logits
livest            -0.69
cheat             -0.68
Ranked            -0.67
enture            -0.67
soDeliveryDate    -0.66
pmwiki            -0.66
dated             -0.65
checks            -0.63
sneak             -0.63
sites             -0.62
Positive Logits
aback             0.85
vier              0.74
saddened          0.73
dy                0.72
urous             0.69
citiz             0.66
enough            0.66
ORK               0.65
seeing            0.64
ãĤ¦ãĤ¹          0.64
Activations Density 0.086%
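
As a rough illustration of what a density figure like this means, the sketch below assumes that density is the fraction of tokens on which the feature's activation is above zero. The function name, the threshold, and the sample data are all hypothetical; the page itself does not say how the 0.086% was computed or how many tokens were sampled.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds the threshold.

    Assumption: "density" means the share of tokens on which the feature
    fires at all. The source page does not define the term.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 tokens, 86 of which activate the feature.
sample = [0.0] * 99_914 + [1.5] * 86
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.086% would mean the feature fires on roughly 86 of every 100,000 tokens, which fits a feature tied to a narrow set of emotional expressions.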