INDEX
Explanations
question marks and related punctuation within a dialogue or inquiry context
Negative Logits
Token      Logit
(missing)  -0.70
tay        -0.59
.          -0.59
(          -0.52
↵          -0.50
Big        -0.47
a          -0.47
↵↵         -0.47
,          -0.46
S          -0.46
Positive Logits
Token      Logit
'?'        1.15
!?         1.09
="?        1.08
?'         1.07
$?         1.06
?...       1.06
?          1.04
?—         1.03
?!?        1.02
?".        1.01
Activations Density 0.506%
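
As a rough illustration of how a density figure like the one above could be computed, the sketch below assumes that activation density is the share of token positions where the feature's activation exceeds zero. The function name, the threshold parameter, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of positions whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of sampled tokens on which
    the feature fires at all; the dashboard may define it differently.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the feature fires on 1 of 200 token positions.
sample = [0.0] * 199 + [3.2]
print(f"{activation_density(sample):.3f}%")  # prints "0.500%"
```

Under this reading, a density of 0.506% would mean the feature fires on roughly one token in every 200 of the sampled text.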