    Neuronpedia

    © Neuronpedia 2025
    EXPLANATION TYPE
    oai_token-act-pair
    Description
    OpenAI's Automated Interpretability, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to add new models and context windows.
    Author
    OpenAI
    URL
    https://github.com/hijohnnylin/automated-interpretability
    Settings
    Default prompts from the main branch, strategy TokenActivationPair.
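As a rough illustration of what a token-activation-pair strategy feeds to the explainer model, the sketch below pairs each token with a discretized activation level, one pair per line. The function names, the 0 to 10 integer scale, and the tab separator are assumptions made for this example; the page does not show the exact prompt layout, so treat this as a minimal sketch rather than the repository's implementation.

```python
import math
from typing import List, Optional, Sequence

MAX_LEVEL = 10  # assumed top of the integer activation scale


def discretize(activations: Sequence[float], max_activation: float) -> List[int]:
    """Map raw activations to integers in [0, MAX_LEVEL].

    Negative activations are clamped to 0 before scaling.
    """
    if max_activation <= 0:
        return [0 for _ in activations]
    return [
        min(MAX_LEVEL, math.floor(MAX_LEVEL * max(a, 0.0) / max_activation))
        for a in activations
    ]


def format_token_activation_pairs(
    tokens: Sequence[str],
    activations: Sequence[float],
    max_activation: Optional[float] = None,
) -> str:
    """Render one "token<TAB>level" line per token (hypothetical layout)."""
    if len(tokens) != len(activations):
        raise ValueError("tokens and activations must have the same length")
    if max_activation is None:
        # Fall back to the largest activation in this snippet.
        max_activation = max(activations, default=0.0)
    levels = discretize(activations, max_activation)
    return "\n".join(f"{tok}\t{lvl}" for tok, lvl in zip(tokens, levels))


if __name__ == "__main__":
    toks = ["the", " use", " of", " voice", "over"]
    acts = [0.0, 0.5, -1.0, 4.0, 2.0]
    print(format_token_activation_pairs(toks, acts))
```

Passing an explicit `max_activation` (for example, the feature's maximum over a whole dataset) keeps levels comparable across snippets; the per-snippet fallback is only a convenience for this sketch.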
    Recent Explanations
    toxic or derogatory statements, especially hate speech targeting identity groups or prompts requesting such content.
    gpt-5
    50 words)<|eot_id|><|start_header_id|>assistant<|end_header_id|>↵↵As a
    LLAMA3.1-8B-IT
    11-RESID-POST-AA
    INDEX 127533
    requests and responses that provide practical how-to advice in list form—“tips” or guidelines—for personal improvement, especially around sleep/insomnia, stress, and time/productivity management.
    gpt-5
    <|im_start|>assistant↵Aqui estão algumas dicas para melhor
    QWEN2.5-7B-IT
    15-RESID-POST-AA
    INDEX 121228
    definite articles and other high-frequency function words, often appearing at the beginnings of sentences or noun phrases.
    gpt-5
    patients.↵This paper outlines the program for controlling surgical infections
    QWEN2.5-7B-IT
    15-RESID-POST-AA
    INDEX 29817
    domain-specific technical terms and acronyms, especially compound or hyphenated nouns across scientific, medical, and product contexts.
    gpt-5
    generally relates to a hydroponic growing system, and
    QWEN2.5-7B-IT
    15-RESID-POST-AA
    INDEX 60661
    explicit task directives in user prompts, i.e., instructions that assign actions or request detailed content generation.
    gpt-5
    . With your expertise, design a novel pharmaceutical for treating
    QWEN2.5-7B-IT
    15-RESID-POST-AA
    INDEX 78095
    phrases that convey quantitative or scientific information—measurements, ratios, fractions, physical processes, and spatial/relational descriptions.
    gpt-5
    it for long but can buy you time. ↵↵•
    QWEN2.5-7B-IT
    15-RESID-POST-AA
    INDEX 27973
    yes/no, first‑person questions that ask whether an action is acceptable or what consequences it will have, often framed with conditionals.
    gpt-5
    copyrighted maps into another map, is the new map copyright
    QWEN2.5-7B-IT
    15-RESID-POST-AA
    INDEX 56858
    references to “voice,” including vocal narration, assistant/command contexts, “voice of …” constructions, and devices or features involving recorded or spoken audio.
    gpt-5
    the use of dialogue, voiceover, or captions.↵
    QWEN2.5-7B-IT
    15-RESID-POST-AA
    INDEX 127909
    user directives to produce or summarize structured technical content (e.g., plans/specifications), especially ISO 26262-style requirements.
    gpt-5
    let make the experimental plan for researcher. I am R
    QWEN2.5-7B-IT
    15-RESID-POST-AA
    INDEX 49307
    blog tag/archive headers indicating a topic, especially the lines introducing and naming the tagged subject.
    gpt-5
    2013 tour dates’↵↵Lil Wayne
    QWEN2.5-7B-IT
    15-RESID-POST-AA
    INDEX 45904
    instances of dialogue or Q&A structure, such as speaker attributions, interview labels with colons, and question/response phrasing.
    gpt-5
    brought to the table and the way she answered was telling
    LLAMA3.1-8B-IT
    7-RESID-POST-AA
    INDEX 119735
    instances of programming or markup code syntax, especially structural punctuation and delimiters indicative of source code snippets.
    gpt-5
    Decl(hasName("B")).bind("b")));↵↵
    LLAMA3.1-8B-IT
    7-RESID-POST-AA
    INDEX 109388
    roleplay-style stage directions and nonverbal action/emotion cues, especially those enclosed in asterisks or parentheses within dialogue.
    gpt-5
    around, lowers his voice* You know, masturbate
    LLAMA3.1-8B-IT
    7-RESID-POST-AA
    INDEX 111493
    template and meta-structural prompt text (chat headers, placeholder tokens like NAME_#, and standalone capital letters/section markers) indicating formatted instructions or scaffolding.
    gpt-5
    NAME_1 and NAME_2 discussing the most perfect
    LLAMA3.1-8B-IT
    7-RESID-POST-AA
    INDEX 104200
    chat-format metadata that marks the assistant’s turn, especially the closing assistant header delimiter.
    gpt-5
    between characters<|eot_id|><|start_header_id|>assistant<|end_header_id|>↵↵I am the master
    LLAMA3.1-8B-IT
    7-RESID-POST-AA
    INDEX 25931
    references to elapsed-time expressions indicating something happened in the past
    gpt-5
    K, V> HashBiMap<K, V>
    LLAMA3.1-8B-IT
    7-RESID-POST-AA
    INDEX 69713
    tokens marking the assistant role or the start of an assistant response in a chat-format transcript.
    gpt-5
    50 words)<|eot_id|><|start_header_id|>assistant<|end_header_id|>↵↵"They are weak
    LLAMA3.1-8B-IT
    7-RESID-POST-AA
    INDEX 86557
    markers that denote the start of the assistant’s reply in chat-formatted conversations.
    gpt-5
    body.<|eot_id|><|start_header_id|>assistant<|end_header_id|>↵↵I grin at my
    LLAMA3.1-8B-IT
    7-RESID-POST-AA
    INDEX 39042
    capitalized proper nouns and titles, especially names of people, places, and creative works.
    gpt-5
    of their kingdom -↵BATHORY can virtually single-handed
    LLAMA3.1-8B-IT
    7-RESID-POST-AA
    INDEX 36669
    capitalized jargon and proper nouns tied to technology/gaming and government/bureaucratic contexts.
    gpt-5
    .↵↵The new element, Governmentium (Gv),
    LLAMA3.1-8B-IT
    7-RESID-POST-AA
    INDEX 14438