
    Neuronpedia

    © Neuronpedia 2025
    EXPLANATION TYPE
    oai_token-act-pair
    Description
    OpenAI's Automated Interpretability, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to support additional models and context windows.
    Author
    OpenAI
    URL
    https://github.com/hijohnnylin/automated-interpretability
    Settings
    Default prompts from the main branch, strategy TokenActivationPair.
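The TokenActivationPair strategy named above can be pictured with a short example. This is a minimal sketch, assuming the strategy presents the explainer model with each token next to a discretized activation value. The function names, the 0-10 integer scale, and the tab-separated layout are illustrative assumptions, not the repository's confirmed API or prompt text.

```python
import math
from typing import List


def normalize_activations(activations: List[float], max_activation: float,
                          scale: int = 10) -> List[int]:
    """Map raw activations to integers in [0, scale] (assumed scale: 0-10)."""
    if max_activation <= 0:
        # Nothing fires: every token gets the lowest bucket.
        return [0 for _ in activations]
    # Negative activations are clamped to zero before scaling.
    return [
        min(scale, math.floor(scale * max(a, 0.0) / max_activation))
        for a in activations
    ]


def format_token_activation_pairs(tokens: List[str], activations: List[float],
                                  max_activation: float) -> str:
    """Render one 'token<TAB>activation' pair per line (hypothetical layout)."""
    if len(tokens) != len(activations):
        raise ValueError("tokens and activations must have the same length")
    normalized = normalize_activations(activations, max_activation)
    return "\n".join(f"{tok}\t{act}" for tok, act in zip(tokens, normalized))


if __name__ == "__main__":
    # Hypothetical activations for a short snippet of text.
    print(format_token_activation_pairs(["Do", " you", " want"],
                                        [0.0, 4.0, 1.0],
                                        max_activation=4.0))
```

Pairing each token with a small integer keeps the prompt compact and lets the explainer model compare activation strengths without parsing raw floating-point values.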
    Recent Explanations
    the word "you" when the model is directly addressing the user.
    gemini-2.5-flash
    .↵↵---↵↵Do you want me to:↵↵
    GEMMA-3-4B-IT
    29-GEMMASCOPE-2-RES-65K
    INDEX 1242
    The neuron primarily activates on frequently occurring words like "the" and "and" when they appear in technical or instructional contexts, often in close proximity to numbers or specialized terms.
    gemini-2.5-flash
    pattern may include the sub-steps of: comparing the
    GEMMA-3-27B
    16-GEMMASCOPE-2-RES-16K
    INDEX 1
    the AI's `assistant` role, as well as the 'Chat GPT' and 'Anti GPT' labels used in the responses.
    gemini-2.5-flash
    "Do you understand?"<|eot_id|><|start_header_id|>assistant<|end_header_id|>↵↵No!
    LLAMA3.1-8B-IT
    11-RESID-POST-AA
    INDEX 33274
    The neuron spotlights special control‐ or header‐tokens (like the `<|start_header_id|>`, `<|end_header_id|>`, and similar markers) that delimit and label parts of the chat transcript.
    o4-mini
    "Do you understand?"<|eot_id|><|start_header_id|>assistant<|end_header_id|>↵↵No!
    LLAMA3.1-8B-IT
    11-RESID-POST-AA
    INDEX 33274
    detecting explicit pornographic sexual content and requests for explicit sexual material.
    gpt-5-nano
    "Do you understand?"<|eot_id|><|start_header_id|>assistant<|end_header_id|>↵↵No!
    LLAMA3.1-8B-IT
    11-RESID-POST-AA
    INDEX 33274
    This neuron detects personal names (proper nouns referring to people).
    o4-mini
    , Missouri.↵↵Audrey & Michael’s April
    GEMMA-3-27B
    16-GEMMASCOPE-2-RES-16K
    INDEX 8
    identifying wedding-related content such as mentions of people, venues, and events.
    gpt-5-nano
    , Missouri.↵↵Audrey & Michael’s April
    GEMMA-3-27B
    16-GEMMASCOPE-2-RES-16K
    INDEX 8
    The neuron is keyed to detecting web addresses (e.g. the “www” token in URLs).
    o4-mini
    &o=https://www.rabat.net
    GEMMA-3-27B
    16-GEMMASCOPE-2-RES-16K
    INDEX 9
    the neuron is looking for URLs and online web addresses.
    gpt-5-nano
    &o=https://www.rabat.net
    GEMMA-3-27B
    16-GEMMASCOPE-2-RES-16K
    INDEX 9
    identifying and quantifying the overlap between upper and lower layer patterns.
    gpt-5-nano
    pattern may include the sub-steps of: comparing the
    GEMMA-3-27B
    16-GEMMASCOPE-2-RES-16K
    INDEX 1
    programming code blocks and structure keywords across various programming languages.
    claude-4-5-haiku
    Node* head;↵↵ LinkedList() {↵ head =
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 88097
    detailed, substantive, information-rich prose with specific facts, technical terminology, and concrete examples.
    claude-4-5-haiku
    " (GPRVS), was adopted by lapar
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 75837
    detailed informational and explanatory content that provides substantive descriptions or analysis of a topic.
    claude-4-5-haiku
    requires Optifine to function properly and it’s also
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 72546
    tokens that are part of an AI assistant's generated response content.
    claude-4-5-haiku
    ↵<|im_start|>assistant↵There are several different ways to convert
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 24708
    words indicating whether a technical solution works or successfully solves a problem.
    claude-4-5-haiku
    that the average is calculated correctly in one case, but
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 110053
    technical or specialized terminology and detailed descriptions of complex systems.
    claude-4-5-haiku
    become vulnerable to traps and trap enchantments↵6.
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 59350
    suggestions or recommendations for what someone should do or consider.
    claude-4-5-haiku
    of date. Please consider creating a new thread.↵↵I
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 48541
    critical assessments of problems, uncertainties, or negative outcomes.
    claude-4-5-haiku
    word was said at the time of the Uggla
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 31233
    prescriptive medical or health advice and instructions on what someone should do or take.
    claude-4-5-haiku
    in children. I would begin with 10 drops
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 71584
    instructions about how to analyze, process, or structure responses to user queries.
    claude-4-5-haiku
    between letters of the word davidjl<|im_end|>↵<|im_start|>assistant↵
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 130789