Explanations
words related to negative outcomes or feelings like disappointments, annoyances, and embarrassments
words related to negative reactions or disappointments
Negative Logits
Token         Logit
$$$$          -0.69
Archangel     -0.66
bonding       -0.66
Wedding       -0.64
wrist         -0.63
extraction    -0.63
swear         -0.63
©¶æ           -0.62
skiing        -0.61
ouf           -0.60
Positive Logits
Token         Logit
ingly         1.45
ments         1.27
disappoint    1.14
ances         0.98
ters          0.97
terness       0.95
ings          0.92
tered         0.85
cha           0.81
rities        0.80
Activations Density: 0.021%