    Explanations

    instructions

    The neuron fires on the key “object-and-parameter” words in step-by-step instructions: terms like “module,” “new,” and “name” that label what is being renamed or configured.

    Negative Logits
     leaned     -0.07
     Puzzle     -0.07
     Ped        -0.07
    dummy       -0.07
     covid      -0.06
     پاد        -0.06
    MUX         -0.06
     nin        -0.06
     That       -0.06
     Ts         -0.06
    Positive Logits
     генера     0.07
    "/>↵        0.07
    .pick       0.07
    ondere      0.07
    цієн        0.07
    .tom        0.07
     review     0.07
     одну       0.06
                0.06
    #\          0.06
    Activation Density: 0.202%

    No Known Activations