Explanations
expressions of gratitude and acknowledgment in dialogue
Negative Logits
beck    -0.16
ull     -0.15
íİ      -0.15
wers    -0.15
573     -0.15
.dtd    -0.15
ille    -0.15
amp     -0.15
roid    -0.15
lice    -0.15
Positive Logits
Leaving    0.18
leaving    0.17
yleft      0.15
leave      0.15
LTRB       0.15
SED        0.15
amerate    0.15
togroup    0.15
leave      0.15
kker       0.14
Activations Density 0.278%