INDEX
Explanations
conversational expressions related to social interactions and evaluations
Negative Logits
меч          -0.15
◌́ (U+0301)   -0.15
rc           -0.13
較           -0.13
ymoon        -0.13
яв           -0.13
塚           -0.12
ثابت         -0.12
IENTATION    -0.12
INY          -0.12
Positive Logits
yeah         0.15
apos         0.15
Yeah         0.14
mr           0.14
íses         0.14
And          0.13
315          0.13
377          0.13
yes          0.13
ikit         0.13
Activations Density 0.057%