Explanations
phrases or words related to positive emotions and encouragement
emojis and expressions of positive sentiment
Negative Logits
lings           -0.90
©¶æ             -0.89
iple            -0.84
ling            -0.73
teenth          -0.71
displacement    -0.70
neighb          -0.70
disputed        -0.69
subsequ         -0.69
contracted      -0.69
Positive Logits
:)              1.09
:-)             0.98
ðŁĺ             0.96
ðŁĻĤ            0.93
;)              0.91
Thank           0.90
Please          0.87
:(              0.87
ðŁĺ             0.84
Sorry           0.84
Activations Density: 0.019%