INDEX
    Explanations

    code-related syntax elements

    Negative Logits
    Token         Logit
    AndEndTag     -0.94
    <unused68>    -0.93
    <unused17>    -0.93
    <unused14>    -0.92
    <unused52>    -0.92
    <unused41>    -0.92
    <unused47>    -0.92
    <unused42>    -0.92
    <unused16>    -0.92
    <unused28>    -0.92
    Positive Logits
    Token     Logit
    .*;       0.68
    ;         0.59
    .*;\n     0.56
    );        0.46
    ();       0.46
    ::*;      0.46
    ];        0.45
    ;\n       0.42
    .         0.41
    ';        0.41
    Activation Density: 0.002%

    No Known Activations