Explanations
words and phrases related to satisfaction and dissatisfaction
Negative Logits
Token         Logit
izzo          -0.16
oping         -0.15
NotAllowed    -0.15
uel           -0.14
sdale         -0.14
uzu           -0.14
füh           -0.14
ediği         -0.14
mage          -0.14
elop          -0.14
Positive Logits
Token         Logit
ment          0.23
ably          0.23
ingly         0.22
Satisfaction  0.20
度            0.19
satisfaction  0.18
atisfaction   0.18
ments         0.17
/content      0.17
iable         0.16
Activations Density: 0.030%