INDEX
    Explanations

    parentheses and other grouping symbols in the text

    Negative Logits
    Token      Logit
    "$\"       -0.87
    "_"        -0.75
    "${"       -0.73
    "(-"       -0.70
    "["        -0.70
    "{"        -0.68
    "@"        -0.68
    "$-"       -0.67
    "$-."      -0.67
    "$"        -0.66

    Positive Logits
    Token      Logit
    "..)"      1.12
    ",)"       0.96
    "....)"    0.93
    "…)."      0.91
    " );"      0.88
    "...),"    0.86
    ".),"      0.85
    " )"       0.84
    "?),"      0.84
    " ),"      0.83

    Act Density 0.512%

    No Known Activations