    Explanations

    patterns of curly braces or brackets

    Negative Logits
    Token    Logit
    ness     -0.80
             -0.73
    ers      -0.72
    er       -0.69
    (        -0.68
    ment     -0.66
    <sup>    -0.66
    an       -0.66
    ating    -0.65
    ings     -0.64
    Positive Logits
    Token    Logit
    "]}      1.43
    "}       1.43
    ']}      1.39
     }}$}    1.39
    ")}      1.36
    ]")]     1.36
    '}       1.34
    ).}      1.21
    .)}      1.21
    ')}      1.20
    Activation Density: 0.288%

    No Known Activations