Explanations
mathematical notation and variables
Negative Logits
0.76
↳  0.74
)+"  0.74
➤  0.71
¶  0.70
►  0.70
)+'  0.70
0.68
0.68
0.67
Positive Logits
\  1.77
(\  1.45
\,  1.30
^{\  1.28
[\  1.20
$,  1.20
(\  1.19
-\  1.18
\;  1.15
{\  1.15
Activations Density 0.276%