Explanations
questions and inquiries that express curiosity or uncertainty
Negative Logits
Token     Logit
976       -0.07
ãĥ³ãĥĸ    -0.07
edly      -0.07
897       -0.06
cher      -0.06
Æ¡        -0.06
ozy       -0.06
rish      -0.06
inet      -0.06
imer      -0.06
Positive Logits
Token     Logit
perhaps   0.08
maybe     0.08
might     0.08
wonder    0.08
could     0.08
adera     0.07
possibly  0.07
Could     0.07
wouldn    0.07
èĥ½       0.07
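
Some token strings in the logit lists above (for example èĥ½ and Æ¡) look like the raw form of a byte-level BPE vocabulary rather than readable text. The sketch below assumes, without confirmation from this page, that the tokenizer uses the GPT-2-style byte-to-character mapping; under that assumption it turns such a string back into readable text. The helper names `bytes_to_unicode` and `decode_token` are illustrative.

```python
def bytes_to_unicode():
    """Build the GPT-2-style byte -> printable character table (assumed here)."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at or above 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


# Invert the table: displayed character -> original byte value.
BYTE_DECODER = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map each displayed character back to its byte, then decode as UTF-8."""
    raw = bytes(BYTE_DECODER[ch] for ch in token)
    # Partial multi-byte sequences are possible, so replace rather than fail.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["èĥ½", "Æ¡", "ãĥ³ãĥĸ", "perhaps"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

If the assumption holds, èĥ½ decodes to the Chinese character 能 ("can", "be able to"), which would fit alongside the other positive-logit tokens such as "could" and "might". Plain ASCII tokens pass through unchanged.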
Activations Density 0.037%
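
The page does not define how the density figure is computed. One common reading, assumed here, is the fraction of sampled tokens on which the feature's activation is above zero. The sketch below illustrates that reading with made-up activations; `activation_density` and the sample data are hypothetical and not taken from this dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold` (assumed definition)."""
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, of which 4 activate the feature.
sample = [0.0] * 9_996 + [1.2, 0.4, 3.1, 0.9]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.037% would correspond to roughly 37 active tokens per 100,000 sampled.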