INDEX
Explanations
the beginning markers or indicators of new sections in the text
Negative Logits
Token      Logit
(blank)    -0.69
(          -0.66
I          -0.58
what       -0.55
(          -0.53
someone    -0.52
pretty     -0.51
may        -0.51
:          -0.50
*          -0.50
Positive Logits
Token            Logit
IsContent        1.07
estekak          1.05
########.        1.04
脚注の使い方     1.03
betweenstory     1.01
StructEnd        1.00
nahilalakip      0.99
parsedMessage    0.98
__":             0.95
Бахар            0.94
Activations Density 0.075%