Explanations
conversational phrases related to humor and informal interactions
Negative Logits
awy       -0.18
idor      -0.16
å·        -0.15
utors     -0.15
ouv       -0.15
ijkl      -0.14
iÄį       -0.14
:async    -0.14
meli      -0.14
ä¹ĭä¸Ģ    -0.13
Positive Logits
folks      0.43
guys       0.40
sir        0.39
gentlemen  0.36
mate       0.36
Fol        0.36
ladies     0.35
boys       0.33
fol        0.33
dear       0.31
Activations Density 0.746%