Explanations
phrases indicating invitations, gratitude, and proposals
Negative Logits
ien          -0.16
jd           -0.16
oir          -0.15
_PLATFORM    -0.15
_CM          -0.14
çIJĨ          -0.14
elman        -0.14
iece         -0.14
zes          -0.14
ceed         -0.14
Positive Logits
shiv         0.17
endon        0.15
ILES         0.15
ãĥ©ãĥ³ãĤ¹       0.15
esson        0.14
tetas        0.14
ibal         0.14
dzi          0.14
iddet        0.14
ibili        0.14
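The negative and positive logit lists above rank output tokens by how strongly the feature pushes their logits down or up. The source does not say how these values were computed. A minimal sketch, assuming the common approach of taking the dot product of a feature's output direction with each token's unembedding vector, might look like the following. The function name `top_logit_tokens`, the toy vocabulary, and all numbers are hypothetical and are not taken from this page.

```python
def top_logit_tokens(direction, unembed, vocab, k=10):
    """Rank tokens by their logit effect for one feature.

    Assumption: the logit effect of a token is the dot product of the
    feature's output direction with that token's unembedding vector.
    Returns (most negative k, most positive k) as lists of (token, score).
    """
    scores = [sum(d * w for d, w in zip(direction, row)) for row in unembed]
    ranked = sorted(zip(vocab, scores), key=lambda pair: pair[1])
    negative = ranked[:k]          # lowest scores first
    positive = ranked[::-1][:k]    # highest scores first
    return negative, positive


# Toy data (hypothetical): 4 tokens, 2-dimensional residual stream.
vocab = ["a", "b", "c", "d"]
unembed = [[0.5, 1.0], [-0.25, 0.0], [0.0, 2.0], [0.25, 0.0]]
direction = [1.0, 0.0]

negative, positive = top_logit_tokens(direction, unembed, vocab, k=2)
print("negative:", negative)
print("positive:", positive)
```

With real model weights, `unembed` would hold one row per vocabulary token, and `k=10` would give lists the same length as the ones shown here.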
Activations Density 0.034%
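Activation density is usually read as the fraction of token positions on which the feature fires. The sketch below assumes that definition and assumes "fires" means an activation strictly above a threshold of zero. The function name and the sample data are hypothetical. The data is synthetic, chosen only so that the result matches the 0.034% figure.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of positions where the activation exceeds `threshold`.

    Assumption: a feature counts as active when its activation is
    strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic example: 34 active positions out of 100,000 tokens.
sample = [0.0] * 99_966 + [1.0] * 34
density = activation_density(sample)
print(f"Activations Density {density:.3%}")
```

A density this low means the feature is highly sparse, firing on roughly 3 to 4 tokens out of every 10,000 under the assumed definition.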