INDEX
Explanations
questioning expressions or prompts
rhetorical questions and expressions of uncertainty
Negative Logits
shaw        -0.76
icro        -0.72
aper        -0.71
ania        -0.69
achy        -0.68
undo        -0.68
arers       -0.67
arer        -0.67
opsis       -0.65
eper        -0.65
Positive Logits
.?          0.98
???         0.89
����        0.88
?,          0.86
Huh         0.81
Nope        0.77
soever      0.77
Interest    0.74
?:          0.73
Nationwide  0.69
Activations Density 0.032%