Explanations
a mix of introductory phrases and expressions of time or conditions
Negative Logits
Token       Logit
.           -0.58
;           -0.52
↵           -0.51
fsp         -0.47
<bos>       -0.47
?           -0.46
LookAnd     -0.45
stanno      -0.45
:           -0.45
::::::::    -0.42
Positive Logits
Token       Logit
itſelf      0.93
{},         0.84
fometimes   0.83
doubtnut    0.82
ſmall       0.80
poffible    0.79
ſeveral     0.79
myſelf      0.78
[],         0.77
$_"         0.77
Activations Density: 0.308%
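
The page does not define the density figure. A minimal sketch of one common reading, assuming it is the share of sampled token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the page or from any particular library.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means (active positions) / (all sampled positions).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 3 of which activate the feature.
sample = [0.0] * 997 + [1.2, 0.4, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.300%
```

Under this reading, a density of 0.308% would mean the feature fires on roughly 3 of every 1000 sampled tokens.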