    Explanations

    This neuron selectively activates on the auxiliary verb “does.”

    Negative Logits
    Token            Logit
    "/meta"          -0.06
    "ownt"           -0.06
    "IAN"            -0.06
    " Seah"          -0.06
    " Barbar"        -0.06
    "Every"          -0.06
    " marathon"      -0.06
    " lun"           -0.06
    " Marathon"      -0.06
    "REAM"           -0.06
    Positive Logits
    Token            Logit
    " does"           0.13
    " did"            0.10
    " doesn"          0.10
    "did"             0.10
    "does"            0.09
    " don"            0.09
    " is"             0.09
    " do"             0.09
    "doesn"           0.09
    "—is"             0.09
    Activation Density: 0.051%

    No Known Activations