Explanations
percentage symbols and related mathematical expressions
Negative Logits
<sup>     -0.90
↵↵        -0.83
<b>       -0.82
<eos>     -0.81
<strong>  -0.74
£         -0.72
↵↵↵       -0.68
’         -0.67
(blank)   -0.66
—         -0.64
Positive Logits
\#        2.19
\%        2.18
\#        2.07
\%)       1.95
\%,       1.94
$\{$      1.82
\$        1.78
\&        1.76
\_        1.71
\_        1.70
Activations Density 0.563%