INDEX
Explanations
expressions conveying frustration or dissatisfaction
Negative Logits
Token           Logit
pedia           -0.74
ourage          -0.73
afort           -0.72
eele            -0.70
osen            -0.70
odiac           -0.69
rium            -0.68
enge            -0.66
obook           -0.66
soDeliveryDate  -0.66
Positive Logits
Token           Logit
seeing          1.25
wasting         1.14
hearing         1.07
dealing         0.93
waiting         0.92
having          0.91
watching        0.90
being           0.90
complaining     0.90
losing          0.90
Activations Density 0.181%