INDEX
Explanations
numerical values, particularly those indicating dates or quantities
dates and numbers
Negative Logits
with          -0.28
while         -0.27
instead       -0.26
although      -0.25
by            -0.23
rinha         -0.23
Vorschlag     -0.23
so            -0.21
zero          -0.20
most          -0.20
Positive Logits
'\\;'         0.88
<unused74>    0.80
<unused14>    0.80
[@BOS@]       0.80
<unused52>    0.80
<unused41>    0.80
<unused43>    0.80
<unused42>    0.80
<unused16>    0.80
<unused28>    0.80
Activations Density 0.031%