INDEX
    Explanations

    code/data structure

    This neuron fires on the structured “Question:” prompt header, that is, on the tokens that label the user’s question (such as “Question:”, “the”, “input”, “question”, “you”) in the few-shot prompt format.

    Negative Logits
    Token            Logit
    (blank)          -0.06
    " Density"       -0.06
    "AMP"            -0.06
    "debian"         -0.06
    "فو"             -0.06
    ".clf"           -0.06
    "Dead"           -0.06
    (blank)          -0.06
    " freshwater"    -0.06
    "_bt"            -0.06
    Positive Logits
    Token            Logit
    "crets"          0.07
    " protested"     0.06
    " губер"         0.06
    " rubble"        0.06
    "eful"           0.06
    "sess"           0.06
    "ئت"             0.06
    "’ll"            0.06
    "ULA"            0.06
    "'ll"            0.06
    Act Density 0.002%

    No Known Activations