
    Neuronpedia

    © Neuronpedia 2025
    EXPLANATION TYPE
    oai_token-act-pair
    Description
    OpenAI's Automated Interpretability, from the paper "Language models can explain neurons in language models", modified by Johnny Lin to add new models and context windows.
    Author
    OpenAI
    URL
    https://github.com/hijohnnylin/automated-interpretability
    Settings
    Default prompts from the main branch, strategy TokenActivationPair.
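The TokenActivationPair strategy named above can be pictured with a small sketch. It is an illustration under stated assumptions, not the repository's actual code: the function names, the 0 to 10 integer scale with negative values clipped to zero, the tab-separated layout, and the sample activation values are all hypothetical choices made for this example.

```python
import math
from typing import Sequence


def normalize_activations(activations: Sequence[float], max_activation: float) -> list[int]:
    """Scale raw activations to integers in 0..10 (assumed scale), clipping negatives to 0."""
    if max_activation <= 0:
        # Nothing fires: report every token as 0.
        return [0 for _ in activations]
    return [
        min(10, math.floor(10 * max(a, 0.0) / max_activation))
        for a in activations
    ]


def format_token_activation_pairs(
    tokens: Sequence[str],
    activations: Sequence[float],
    max_activation: float,
) -> str:
    """Render one 'token<TAB>activation' pair per line (assumed layout)."""
    normalized = normalize_activations(activations, max_activation)
    return "\n".join(f"{tok}\t{act}" for tok, act in zip(tokens, normalized))


# Tokens taken from a snippet on this page; the activation values are made up.
example = format_token_activation_pairs(
    [" dark", " mode", " on", " WhatsApp"],
    [0.0, 0.5, -1.0, 4.0],
    max_activation=4.0,
)
print(example)
```

An explainer model given text in this shape sees each token next to a coarse strength score, which is the kind of input the explanations listed below summarize.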
    Recent Explanations
    descriptions of competency-based education and mastery-oriented progress in learning (advancement based on demonstrated competencies rather than time).
    gpt-5
    ‑time to the demonstration of mastery of explicit, observable
    GPT-OSS-20B
    11-RESID-POST-AA
    INDEX 104038
    interrogative, quiz-style questions—especially sentences with question formatting, wh-words, and simple fact or math queries.
    gpt-5
    Invention of the iPhone Than to the Building of the Great
    GEMMA-3-4B-IT
    11-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 244522
    The neuron fires strongly on numeric tokens—digits, numbers (including years), and related symbols (like “=”)—i.e. it’s looking for numerals/math expressions.
    o4-mini
    Invention of the iPhone Than to the Building of the Great
    GEMMA-3-4B-IT
    11-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 244522
    This neuron detects mentions of mathematical operators and related numeric-constraint terms in quiz instructions.
    o4-mini
    together with add or subtract either.↵4. the
    GEMMA-3-4B-IT
    29-GEMMASCOPE-2-RES-262K
    INDEX 5580
    The neuron detects mentions of clinical symptoms and symptom-describing phrases in medical text, especially respiratory symptoms and related clinical descriptors.
    gpt-5-mini
    uritic chest pain can occur in patients and presents as
    GEMMA-3-4B-IT
    29-GEMMASCOPE-2-RES-65K
    INDEX 1991
    This neuron identifies mentions of pulmonary symptoms—especially dyspnea and different types of cough.
    o4-mini
    uritic chest pain can occur in patients and presents as
    GEMMA-3-4B-IT
    29-GEMMASCOPE-2-RES-65K
    INDEX 1991
    The neuron detects informative, content-heavy words (longer nouns/verbs and section-heading tokens) — i.e., key informational terms in instructional or explanatory text.
    gpt-5-mini
    and how this company differentiates itself.↵* **The
    GEMMA-3-4B-IT
    29-GEMMASCOPE-2-RES-65K
    INDEX 633
    The neuron strongly activates on the names of software products, platforms, frameworks, or technical components (e.g. “WhatsApp,” “Android,” “.NET,” “MVVM/WPF,” etc.).
    o4-mini
    turn on dark mode on WhatsApp for Android, follow these
    GEMMA-3-4B-IT
    29-GEMMASCOPE-2-RES-65K
    INDEX 2514
    It detects self-referential statements where the model talks about itself (first-person identity, creation, capabilities, or "my"/"I" statements).
    gpt-5-mini
    ):**  The body is incredibly efficient. When it
    GEMMA-3-4B-IT
    22-GEMMASCOPE-2-RES-16K
    INDEX 9135
    The neuron detects emphatic or evaluative modifier words—strong adjectives and adverbs like “very,” “important,” “positive,” “negative,” etc.
    o4-mini
    - Accept a challenge with a positive attitude↵  T
    GEMMA-3-4B-IT
    22-GEMMASCOPE-2-RES-16K
    INDEX 7345
    structured, list-style formatting—especially numbered items with bolded headings, product/brand names, acronyms, and section dividers.
    gpt-5
    voices, not the most luxurious feel.↵    *
    GEMMA-3-4B-IT
    22-GEMMASCOPE-2-RES-16K
    INDEX 813
    This neuron activates on copular or auxiliary “to be” verbs (is/are/will be/etc.), flagging statements that define or assert something.
    o4-mini
    . Automated builds and tests are run to detect integration issues
    GEMMA-3-4B-IT
    22-GEMMASCOPE-2-RES-16K
    INDEX 424
    tokens containing the letter “Z” (upper- or lowercase), especially when it appears at the start of names or terms.
    gpt-5
    stability.↵* **Zephyr Holt:** "Ze
    GEMMA-3-4B-IT
    22-GEMMASCOPE-2-RES-16K
    INDEX 1339
    The neuron fires strongly on genre labels—especially darker ones—such as “Horror,” “Mystery,” “dark,” or “scary.”
    o4-mini
    , Mystery, Romance, Horror, Slice of Life,
    GEMMA-3-4B-IT
    22-GEMMASCOPE-2-RES-16K
    INDEX 3759
    statements and headings that frame structured analysis or troubleshooting, signaling problem identification, core issues, challenges, breakdowns, and considerations.
    gpt-5
    **1. The Core Problem: Copyright and Removal**
    GEMMA-3-4B-IT
    22-GEMMASCOPE-2-RES-16K
    INDEX 433
    Sections that present a problem-analysis and solution/troubleshooting advice (headings like “The Core Problem,” reasons, and what to do).
    gpt-5-mini
    **1. The Core Problem: Copyright and Removal**
    GEMMA-3-4B-IT
    22-GEMMASCOPE-2-RES-16K
    INDEX 433
    This neuron spots words and phrases that introduce or label problems—like “issue,” “breakdown,” “core problem,” or other signals that a difficulty is being explained.
    o4-mini
    **1. The Core Problem: Copyright and Removal**
    GEMMA-3-4B-IT
    22-GEMMASCOPE-2-RES-16K
    INDEX 433
    strongly negative, complaint-style review language indicating dissatisfaction with a product, service, or experience.
    gpt-5
     Kitchen. Not recommended at all. Lethargic service
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-16K
    INDEX 9764
    first-person, autobiographical statements expressing personal experience, thoughts, or preferences within informal explanations or advice.
    gpt-5
     size smalls. We used disposables about half the
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-16K
    INDEX 14915
    critical evaluations of media that call out contrivance or unrealistic, overly neat/predictable elements, often marked by intensifiers and evaluative qualifiers.
    gpt-5
    , a dark (sometimes ludicrously so) crime saga
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-16K
    INDEX 15153