INDEX
Explanations
dialogue or conversation phrases
conversational phrases and expressions of informal dialogue
Negative Logits
ulz       -0.79
iple      -0.71
Versions  -0.69
eatures   -0.69
etary     -0.68
heast     -0.66
olves     -0.65
Īè        -0.65
asters    -0.65
ensed     -0.62
Positive Logits
yeah      0.97
uh        0.88
oh        0.83
sir       0.78
hey       0.78
yes       0.78
maybe     0.75
wow       0.74
um        0.73
huh       0.71
Activations Density 0.192%