
    Neuronpedia

    EXPLANATION TYPE
    oai_token-act-pair
    Description
    OpenAI's Automated Interpretability, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to support additional models and context windows.
    Author
    OpenAI
    URL
    https://github.com/hijohnnylin/automated-interpretability
    Settings
    Default prompts from the main branch, using the TokenActivationPair strategy.
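    As the name TokenActivationPair suggests, the explainer model is shown each token next to its activation. The sketch below only illustrates that idea and is not the repository's code: the helper name `format_token_activation_pairs`, the integer 0-10 scaling, the tab-separated layout, and the activation values in the usage example are all assumptions made for this example.

```python
from typing import Sequence


def format_token_activation_pairs(
    tokens: Sequence[str],
    activations: Sequence[float],
    max_activation: float,
) -> str:
    """Render one 'token<TAB>scaled_activation' pair per line.

    Hypothetical helper: the 0..10 integer scale and the tab separator
    are assumptions for illustration, not a documented format.
    """
    if len(tokens) != len(activations):
        raise ValueError("tokens and activations must have the same length")

    lines = []
    for token, act in zip(tokens, activations):
        if max_activation > 0:
            # Clamp negatives to zero, then scale relative to the maximum.
            scaled = round(10 * max(act, 0.0) / max_activation)
        else:
            scaled = 0
        # Cap at 10 in case an activation exceeds the supplied maximum.
        lines.append(f"{token}\t{min(scaled, 10)}")
    return "\n".join(lines)


# Invented activations for a few tokens from a code snippet.
example = format_token_activation_pairs(
    ["Node", "*", " head", ";"],
    [0.0, 0.5, 2.0, 4.0],
    max_activation=4.0,
)
print(example)
```

    Scaling against a shared maximum keeps values comparable across text excerpts; a different normalisation would work equally well for the purposes of this illustration.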
    Recent Explanations
    programming code blocks and structure keywords across various programming languages.
    claude-4-5-haiku
    Node* head;↵↵ LinkedList() {↵ head =
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 88097
    detailed, substantive, information-rich prose with specific facts, technical terminology, and concrete examples.
    claude-4-5-haiku
    " (GPRVS), was adopted by lapar
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 75837
    detailed informational and explanatory content that provides substantive descriptions or analysis of a topic.
    claude-4-5-haiku
    requires Optifine to function properly and it’s also
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 72546
    tokens that are part of an AI assistant's generated response content.
    claude-4-5-haiku
    ↵<|im_start|>assistant↵There are several different ways to convert
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 24708
    words indicating whether a technical solution works or successfully solves a problem.
    claude-4-5-haiku
    that the average is calculated correctly in one case, but
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 110053
    technical or specialized terminology and detailed descriptions of complex systems.
    claude-4-5-haiku
    become vulnerable to traps and trap enchantments↵6.
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 59350
    suggestions or recommendations for what someone should do or consider.
    claude-4-5-haiku
    of date. Please consider creating a new thread.↵↵I
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 48541
    critical assessments of problems, uncertainties, or negative outcomes.
    claude-4-5-haiku
    word was said at the time of the Uggla
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 31233
    prescriptive medical or health advice and instructions on what someone should do or take.
    claude-4-5-haiku
    in children. I would begin with 10 drops
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 71584
    instructions about how to analyze, process, or structure responses to user queries.
    claude-4-5-haiku
    between letters of the word davidjl<|im_end|>↵<|im_start|>assistant↵
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 130789
    attempts to jailbreak or manipulate the AI into violating its guidelines and generating inappropriate content.
    claude-4-5-haiku
    uale entro-mondamento. NAME_1 off
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 129439
    low-quality, spam, or adult content in text.
    claude-4-5-haiku
    which bring brandy did so after having fuckathon with
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 43830
    concrete nouns and key informational content words that carry semantic weight in the text.
    claude-4-5-haiku
    or provide you with more information.↵↵The marketing sector can
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 36268
    user search queries and informational requests, particularly the key terms within those queries across multiple languages.
    claude-4-5-haiku
    liste types artisanat existant époque moderne Po
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 2568
    factually incorrect or overconfident assertions presented with certainty.
    claude-4-5-haiku
    6. No credit check required↵7. Short-term
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 6997
    words or tokens related to programming, technical terms, or conversational roles within code or instruction-like contexts.
    gemini-2.5-flash
    between letters of the word davidjl<|im_end|>↵<|im_start|>assistant↵
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 130789
    References to laws, rules, statutes, regulations, and formal legal citations (including acronyms and numbered rule/section citations).
    gpt-5-mini
    ↵↵  was relevant under CRE 401 and
    GEMMA-2-2B
    12-GEMMASCOPE-RES-16K
    INDEX 3021
    The neuron detects mentions of alternatives, substitutes, or replacement concepts — when something is presented as an alternative or being replaced.
    gpt-5-mini
    , claims of inadequate chemical substitutes, difficulty in getting industri
    GEMMA-2-2B
    12-GEMMASCOPE-RES-16K
    INDEX 3006
    The neuron detects named entities — proper nouns like people, organizations, places, and dates.
    gpt-5-mini
     Shawn, and Scott. Two of these names are household
    GEMMA-2-2B
    12-GEMMASCOPE-RES-16K
    INDEX 3003
    Spots technical references to input(s) or input-related fields in code and documentation.
    gpt-5-mini
     avformat_open_input(&fContextReadFrame
    GEMMA-2-2B
    12-GEMMASCOPE-RES-16K
    INDEX 2971