    Explanations

    percentage symbols and related mathematical expressions

    Negative Logits
    <sup>                 -0.90
    ↵↵                    -0.83
    <b>                   -0.82
    <eos>                 -0.81
    <strong>              -0.74
     £                    -0.72
    ↵↵↵                   -0.68
    (no visible token)    -0.67
    (whitespace)          -0.66
    (no visible token)    -0.64
    Positive Logits
    #                     2.19
    %                     2.18
     #                    2.07
    %)                    1.95
    %,                    1.94
     {                    1.82
    $                     1.78
    &                     1.76
    _                     1.71
     _                    1.70
    Act Density 0.563%

    No Known Activations