INDEX
Explanations
punctuation marks and other symbols used in text formatting or coding
Negative Logits
myſelf      -1.02
Theſe       -0.96
་་          -0.96
Reſ         -0.93
propOrder   -0.90
doubtnut    -0.89
itſelf      -0.86
Forumite    -0.86
raiſ        -0.86
faſt        -0.86
Positive Logits
,           0.77
!           0.66
</sup>      0.63
)           0.63
=           0.62
-           0.59
</sub>      0.58
and         0.57
</em>       0.56
</code>     0.53
Activations Density 1.235%