INDEX
Explanations
HTML or CSS class attributes
class followed by =
Negative Logits
Token      Logit
/          -0.44
,          -0.44
;          -0.38
-,         -0.35
-          -0.32
、         -0.32
.          -0.31
–          -0.31
(blank)    -0.30
¬          -0.30
Positive Logits
Token         Logit
<unused14>    0.94
[@BOS@]       0.94
<unused42>    0.94
<unused80>    0.94
<unused51>    0.94
<unused43>    0.94
<unused52>    0.94
<unused41>    0.94
<unused8>     0.94
<unused16>    0.94
Activations Density 0.015%
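
As a rough illustration of the density figure above: it is usually read as the fraction of tokens on which the feature has a nonzero activation. The sketch below assumes that reading. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical, and this is not necessarily how the dashboard computes the value.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the
    feature fires at all; the dashboard's exact definition may differ.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: the feature fires on 3 of 20,000 tokens.
sample = [0.0] * 20_000
for position in (10, 4_000, 19_999):
    sample[position] = 1.5

density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low means the feature is silent on nearly all tokens, which fits a feature described as firing on `class` followed by `=`.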