
    Neuronpedia

    © Neuronpedia 2025
    EXPLANATION TYPE
    oai_token-act-pair
    Description
    OpenAI's automated interpretability method, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to add new models and context windows.
    Author
    OpenAI
    URL
    https://github.com/hijohnnylin/automated-interpretability
    Settings
    Default prompts from the main branch, using the TokenActivationPair strategy.
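
    As a rough illustration of what a token-activation-pair prompt might look like, the sketch below pairs each token with a discretized activation, one pair per line. The function names, the tab separator, and the 0 to 10 integer scale are assumptions made for illustration; they are not taken from this page and may differ from what the linked repository actually does.

```python
import math
from typing import List


def normalize_activations(activations: List[float], max_activation: float) -> List[int]:
    """Map raw activations to integers in 0..10 (assumed scale).

    Negative activations are clamped to 0 before scaling.
    """
    if max_activation <= 0:
        return [0 for _ in activations]
    return [
        min(10, math.floor(10 * max(0.0, a) / max_activation))
        for a in activations
    ]


def format_token_activation_pairs(
    tokens: List[str], activations: List[float], max_activation: float
) -> str:
    """Render one 'token<TAB>activation' pair per line (hypothetical layout)."""
    if len(tokens) != len(activations):
        raise ValueError("tokens and activations must have the same length")
    scaled = normalize_activations(activations, max_activation)
    return "\n".join(f"{tok}\t{act}" for tok, act in zip(tokens, scaled))


if __name__ == "__main__":
    # Made-up activation values for tokens from one of the example snippets.
    example_tokens = [" wreaks", " havoc", " in"]
    example_acts = [2.0, 8.0, -1.0]
    print(format_token_activation_pairs(example_tokens, example_acts, max_activation=8.0))
```

    An explainer model would then be shown text in this form and asked to summarize what the high-activation tokens have in common.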
    Recent Explanations
    Words that express strong negative impact, danger, or sensational severity (e.g., disaster/havoc/doom-type terms).
    gpt-5-mini
    pathogen that wreaks havoc in the livestock industry of
    QWEN2.5-7B-IT
    11-RESID-POST-AA
    INDEX 87140
    It detects headings, titles, or other section-start/heading tokens that mark the start of a new block or prominent label.
    gpt-5-mini
    Life is noisy and confusing↵There is so much going
    QWEN2.5-7B-IT
    11-RESID-POST-AA
    INDEX 25109
    This neuron detects questions—tokens and turns that are part of user (or conversational) interrogative utterances.
    gpt-5-mini
    to happen soon?<|im_end|>↵<|im_start|>assistant↵As an
    QWEN2.5-7B-IT
    11-RESID-POST-AA
    INDEX 61872
    Tokens marking the assistant's reply (the assistant role / assistant message starts).
    gpt-5-mini
    <|im_end|>↵<|im_start|>assistant↵Claro! Aqui
    QWEN2.5-7B-IT
    11-RESID-POST-AA
    INDEX 85710
    the neuron detects numeric tokens, especially years and other multi-digit dates/numbers.
    gpt-5-mini
    9, 2010. The episode was
    QWEN2.5-7B-IT
    11-RESID-POST-AA
    INDEX 92380
    tokens that belong to the assistant's generated message or message/metadata markers (i.e., assistant-role and model-generated content).
    gpt-5-mini
    or update.↵↵As of my knowledge cutoff in September
    QWEN2.5-7B-IT
    11-RESID-POST-AA
    INDEX 41407
    Tokens that mark the assistant's turn/start of an assistant response (assistant-turn boundary).
    gpt-5-mini
    Barcelona to Moscow<|im_end|>↵<|im_start|>assistant↵The quickest and
    QWEN2.5-7B-IT
    11-RESID-POST-AA
    INDEX 75599
    tokens that are part of the user's input (i.e., user-role prompt text).
    gpt-5-mini
    breast expansion story.<|im_end|>↵<|im_start|>assistant↵Once upon
    QWEN2.5-7B-IT
    11-RESID-POST-AA
    INDEX 1882
    This neuron detects assistant-generated text (tokens marking the assistant's responses).
    gpt-5-mini
    ?<|im_end|>↵<|im_start|>assistant↵Most matters that are commonly
    QWEN2.5-7B-IT
    11-RESID-POST-AA
    INDEX 21618
    References to holidays, festivals, or seasonal/celebration-related terms (names of holidays, festival events, and related date/time words).
    gpt-5-mini
    by again tomorrow.↵↵April Fool’s Day↵The ultimate
    QWEN2.5-7B-IT
    11-RESID-POST-AA
    INDEX 23022
    tokens that mark or occur inside the assistant's replies (the assistant speaker/response segments).
    gpt-5-mini
    Hi llama<|im_end|>↵<|im_start|>assistant↵Hello there! I
    QWEN2.5-7B-IT
    11-RESID-POST-AA
    INDEX 25757
    mentions of Asimov's Laws of Robotics (references to the Three/Four Laws, "Laws", "robot", or "robotics").
    gpt-5-mini
    Universum von Isaac Asimov. Die Geset
    QWEN2.5-7B-IT
    11-RESID-POST-AA
    INDEX 30068
    contrasting statements using "but" to introduce unexpected or opposing ideas.
    claude-4-5-haiku
     F1 drivers, but they were grand prix driver none
    GEMMA-2-9B
    31-GEMMASCOPE-RES-16K
    INDEX 16383
    time-related phrases describing delivery windows and scheduling, particularly patterns involving days, times, and frequency of operations.
    claude-4-5-haiku
     dispatched the same or the next working day.↵↵Comment
    GEMMA-2-9B
    31-GEMMASCOPE-RES-16K
    INDEX 16382
    abstract quantitative relationships and associations between variables.
    claude-4-5-haiku
           2↵         Some confusion exists in the
    GEMMA-2-9B
    31-GEMMASCOPE-RES-16K
    INDEX 16381
    technical and specialized vocabulary from academic and professional domains.
    claude-4-5-haiku
     and specialisation. For emergency cases the rescue station has
    GEMMA-2-9B
    31-GEMMASCOPE-RES-16K
    INDEX 16380
    calls-to-action and contact information phrases.
    claude-4-5-haiku
     Contact Tony Inman via this website for a chat about
    GEMMA-2-9B
    31-GEMMASCOPE-RES-16K
    INDEX 16379
    baseball game summaries and sports statistics.
    claude-4-5-haiku
     extra innings by getting Juan Francisco to pop up on the
    GEMMA-2-9B
    31-GEMMASCOPE-RES-16K
    INDEX 16378
    opening square brackets followed by text content.
    claude-4-5-haiku
     “a claim of th[e↵same] order
    GEMMA-2-9B
    31-GEMMASCOPE-RES-16K
    INDEX 16377
    dialogue and conversational exchanges between characters.
    claude-4-5-haiku
     you were speaking to me." He had an accent which
    GEMMA-2-9B
    31-GEMMASCOPE-RES-16K
    INDEX 16376