    Explanations

    instances of nested brackets or array-like structures

    Negative Logits
    ')}}         -0.81
    ')))         -0.75
    ']))         -0.72
     ')\n        -0.71
    ''')         -0.71
     }))         -0.70
    )')          -0.68
    '})          -0.67
    ))))))))     -0.65
    "})          -0.64
    Positive Logits
    [            1.84
     }^{[        1.74
    ![           1.45
    ?[           1.45
    ^{[          1.44
    $[           1.42
     $[          1.42
    _[           1.41
    ("[          1.41
    ()[          1.40
    Activation Density: 0.159%

    No Known Activations