INDEX
    Explanations

    This neuron detects placeholder instruction fragments, specifically "[ insert … here ]"-style tokens that mark where a substitution should go.
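
    To make the described text pattern concrete, here is a minimal sketch of a regular expression that matches "[ insert … here ]"-style placeholders. It only illustrates the kind of string the explanation refers to, not how the neuron computes its activation. The names `PLACEHOLDER` and `find_placeholders` are hypothetical, and the exact pattern (case-insensitive, optional spaces inside the brackets) is an assumption.

```python
import re

# Assumed shape of a placeholder: "[", optional spaces, the word "insert",
# any text without brackets, the word "here", optional spaces, "]".
PLACEHOLDER = re.compile(r"\[\s*insert\b[^\[\]]*?\bhere\s*\]", re.IGNORECASE)


def find_placeholders(text: str) -> list[str]:
    """Return every bracketed "[insert ... here]" fragment found in text."""
    return PLACEHOLDER.findall(text)


if __name__ == "__main__":
    sample = "Dear [insert name here], your order [ insert order number here ] shipped."
    for fragment in find_placeholders(sample):
        print(fragment)
```

    Bracketed text that lacks the "insert … here" wording, such as a citation marker like "[1]", is not matched by this sketch.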

    Negative Logits
    "appeared"   -0.08
    " HOH"       -0.07
    "antidad"    -0.07
    "exus"       -0.06
    "ethe"       -0.06
    "LES"        -0.06
    " Payne"     -0.06
    "ばかり"      -0.06
    "too"        -0.06
    "ázev"       -0.06
    Positive Logits
    " Viewer"      0.07
    " trứng"       0.06
    " quad"        0.06
    "\tpacket"     0.06
    ",col"         0.06
    "-->↵"         0.06
    "тесь"         0.06
    " grammar"     0.06
    "exclude"      0.06
    " converged"   0.06
    Activation Density: 0.012%

    No Known Activations