Explanations
expressions of gratitude and positive sentiments in conversations
NEGATIVE LOGITS
Token     Logit
boo       -0.16
ie        -0.16
ivia      -0.15
uum       -0.15
bore      -0.14
ides      -0.14
_sound    -0.14
ulares    -0.14
ombre     -0.14
ces       -0.13
POSITIVE LOGITS
Token      Logit
pleasure   0.24
nice       0.20
Nice       0.19
privilege  0.18
nic        0.18
Nice       0.17
Ple        0.17
nice       0.17
prive      0.17
#ad        0.17
Activations Density 0.128%