INDEX
    Explanations

    syntactical elements and structure in programming code

    Negative Logits
    <unused79>
    -1.37
    <unused52>
    -1.37
    <unused16>
    -1.37
    [@BOS@]
    -1.36
    <unused51>
    -1.36
    <unused23>
    -1.36
    <unused41>
    -1.36
    <unused28>
    -1.36
    <unused3>
    -1.36
    <unused14>
    -1.36
    Positive Logits
    1.33
    ,
    1.03
     (
    0.90
    ↵↵
    0.86
    .
    0.85
    0.82
     "
    0.81
     '
    0.80
      
    0.78
     A
    0.76
    Act Density 0.934%

    No Known Activations