INDEX
    Explanations

    This neuron never activates on any token, i.e. it is effectively “dead” and does not detect any specific pattern.

    Negative Logits
    Logit    Token
    -0.07    OTH
    -0.07    —"
    -0.07    cannot
    -0.07    、「
    -0.06    ();↵↵↵
    -0.06     mentally
    -0.06     Born
    -0.06    comment
    -0.06    ồng
    -0.06     volunteered
    Positive Logits
    Logit    Token
    0.07
    0.07      Tage
    0.06      lavender
    0.06      './../
    0.06      ir
    0.06      vår
    0.06      дерева
    0.06     olicited
    0.06      republik
    0.06     Backdrop
    Act Density 0.005%

    No Known Activations
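
The activation density figure above can be related to the "dead" reading with a small sketch. It assumes that activation density means the fraction of token positions on which the unit's activation is above zero; that definition, the function names, the 1e-4 cutoff, and the sample data are all illustrative assumptions, not taken from this page.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumed definition, for illustration only.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


def is_effectively_dead(activations, max_density=1e-4, threshold=0.0):
    """Flag a unit as effectively dead when its density is at or below
    `max_density` (a hypothetical cutoff chosen for this example)."""
    return activation_density(activations, threshold) <= max_density


# Hypothetical data: 100,000 token positions, 5 of them weakly active.
acts = [0.0] * 100_000
for i in (10, 2_000, 30_000, 45_000, 99_999):
    acts[i] = 0.3

density = activation_density(acts)
print(f"{density:.3%}")           # 0.005%
print(is_effectively_dead(acts))  # True
```

Under these assumptions, a density of 0.005% corresponds to roughly 5 active positions per 100,000 tokens, which is low enough that a unit may show no recorded activating examples while still having a nonzero density.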