Explanations
punctuation marks and sentence boundaries
numerical quantities and units
Negative Logits
briefly             -0.34
Volkes              -0.31
neither             -0.31
y                   -0.31
temporarily         -0.31
(no visible token)  -0.30
blindly             -0.30
both                -0.29
↵                   -0.29
fully               -0.28
POSITIVE LOGITS
ſind                0.79
linawan             0.79
encodeWith          0.77
IntoConstraints     0.77
<unused8>           0.76
[@BOS@]             0.76
<unused16>          0.76
<unused74>          0.76
<unused51>          0.76
<unused14>          0.75
Activations Density: 0.117%