INDEX
    Explanations

    The neuron fires on structural/control tokens (the metadata and boundary markers surrounding the user/system/instruction blocks).

    Negative Logits
     också        -0.07
    erosis        -0.06
    _many         -0.06
     weary        -0.06
     ours         -0.06
     Put          -0.06
    /bus          -0.06
    /fonts        -0.06
     PROF         -0.06
    .students     -0.06
    Positive Logits
    lock          0.08
     seb          0.08
     DNS          0.07
    DNS           0.07
     jam          0.07
     DeepCopy     0.07
    는지            0.07
    _GUID         0.06
     extensions   0.06
    ns            0.06
    Act Density 0.068%

    No Known Activations