INDEX
Explanations
terms and phrases related to relational connections or comparisons in structured data
but, however, while
Negative Logits
<unused41>    -0.76
[@BOS@]       -0.76
<unused43>    -0.76
<unused74>    -0.76
<unused52>    -0.76
<unused42>    -0.75
<unused68>    -0.75
<unused28>    -0.75
<unused8>     -0.75
<unused14>    -0.75
Positive Logits
but           0.47
only          0.39
regardless    0.37
then          0.36
and           0.33
inderdaad     0.32
但不          0.31
indeed        0.31
without       0.30
However       0.30
Activations Density: 0.073%