    Neuronpedia

    © Neuronpedia 2025
    EXPLANATION TYPE
    oai_token-act-pair
    Description
    OpenAI's Automated Interpretability, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to add new models and context windows.
    Author
    OpenAI
    URL
    https://github.com/hijohnnylin/automated-interpretability
    Settings
    Default prompts from the main branch, using the TokenActivationPair strategy.
    Recent Explanations
    specialized, compound jargon terms—especially technical computing/electronics vocabulary and references to input keys and authentication.
    gpt-5
    79)↵Nightseasons (Pittsburgh: Carnegie Mellon
    GEMMA-2-9B
    21-GEMMASCOPE-RES-16K
    INDEX 11322
    hexadecimal-style codes and memory/address dump entries in structured, code-like data.
    gpt-5
    x4147, code:
    GEMMA-2-9B
    21-GEMMASCOPE-RES-16K
    INDEX 514
    programming/code snippets, especially identifiers and syntax from class definitions, namespaces, annotations, and build/config directives.
    gpt-5
    ↵ */↵class LanguageTranslate extends \yii\db
    GEMMA-2-9B
    21-GEMMASCOPE-RES-16K
    INDEX 15328
    The neuron detects section headings and document-structure markers (titles, numbered list entries and other section-start tokens) across languages.
    gpt-5-mini
    "El Arte de la Ilusión" de Charles Baxter
    GEMMA-3-1B-IT
    17-GEMMASCOPE-2-RES-16K
    INDEX 556
    The neuron activates on content-bearing tokens (meaningful nouns/verbs/identifiers and keywords) rather than on common function words or punctuation.
    gpt-5-mini
    # Create the buttons↵select_input_button =
    GEMMA-3-1B-IT
    17-GEMMASCOPE-2-RES-16K
    INDEX 11815
    This neuron detects explicit instructions and formatting or response directives in system or prompt text.
    gpt-5-mini
    numeric characters in your reply. What's the smallest
    GEMMA-3-1B-IT
    17-GEMMASCOPE-2-RES-16K
    INDEX 12616
    References to legal cases, formal court headings, or courtroom/judgment context (case names, "vs.", "The Court", legal-issue headings).
    gpt-5-mini
    Bharat) Ltd. vs. Union of India judgment,
    GEMMA-3-1B-IT
    17-GEMMASCOPE-2-RES-16K
    INDEX 2165
    This neuron strongly detects mentions of short-term rental properties and related review/booking context (references to an Airbnb or property listing and requests to write or describe a stay).
    gpt-5-mini
    a glowing review for an Airbnb located near downtown St.
    GEMMA-3-1B-IT
    17-GEMMASCOPE-2-RES-16K
    INDEX 592
    The neuron detects the assistant's discourse-opening phrases that begin explanations or step‑by‑step guides (e.g., "Okay, let's break down", "Here's a ... guide").
    gpt-5-mini
    code<end_of_turn>↵<start_of_turn>model↵```python↵from datac
    GEMMA-3-1B-IT
    17-GEMMASCOPE-2-RES-16K
    INDEX 1198
    instances of conversation/control tokens and turn boundaries (e.g., <start_of_turn>, system/model markers) — i.e., structural / meta tokens indicating speaker turns or formatting.
    gpt-5-mini
    1. Understanding the Problem**↵↵The core task is
    GEMMA-3-1B-IT
    17-GEMMASCOPE-2-RES-16K
    INDEX 160
    Tokens containing an apostrophe/typographic single-quote (contractions and possessives like I’m, don’t, ’s, etc.).
    gpt-5-mini
     questioning everything and it doesn’t take me long to
    GEMMA-2-2B
    12-GEMMASCOPE-RES-16K
    INDEX 7692
    Phrases that signal a narrated account — words like "story", "tale", "recounted"/"told"/"recalled"/"recount" and other reporting verbs that introduce someone telling an account.
    gpt-5-mini
     She recounted with pride the tale of how she won Player
    GEMMA-2-2B
    12-GEMMASCOPE-RES-16K
    INDEX 1975
    the names of groups, collections, or multi-part structures (words indicating pairs/triplets/components/members/complexes/systems).
    gpt-5-mini
    -sealing the whole members. After resin-se
    GEMMA-2-2B
    12-GEMMASCOPE-RES-16K
    INDEX 4688
    mentions of people, human roles, groups, or agents (e.g., staff, resident, owner, officer, person).
    gpt-5-mini
    TR framework and evaluated with staff evaluations of 80
    GEMMA-2-2B
    12-GEMMASCOPE-RES-16K
    INDEX 5392
    text-structure and formatting markers (section headings, separators and other layout/markup artifacts).
    gpt-5-mini
    tions}↵==============================↵↵In this
    GEMMA-2-2B
    12-GEMMASCOPE-RES-16K
    INDEX 6001
    first-person subjective statements expressing the writer's opinions, preferences, or list-making (e.g., "I", "I'm", "I'll", "my", "picks", "list", "include").
    gpt-5-mini
     so can't obviously include that for 20
    GEMMA-2-2B
    12-GEMMASCOPE-RES-16K
    INDEX 9490
    The neuron detects code-like programming syntax (tokens that signal code structure, especially variable/constant declarations and other keywords).
    gpt-5-mini
    formedResponseException {↵    APINodeList<
    GEMMA-2-2B
    12-GEMMASCOPE-RES-16K
    INDEX 3274
    mentions of spies, intelligence agencies, undercover agents, and related espionage or national-security terms.
    gpt-5-mini
     series of films about an FBI agent.↵↵It was
    GEMMA-2-2B
    12-GEMMASCOPE-RES-16K
    INDEX 9215
    positions of speaker labels and colons that mark turns in a transcript (e.g., "Q:", "A:", or other colon-delimited speaker markers).
    gpt-5-mini
     creation process?↵↵A: My inspiration to write comes
    GEMMA-2-2B
    12-GEMMASCOPE-RES-16K
    INDEX 650
    the presence of function words and discourse/connective markers (common stopwords, sentence-level connectors, and punctuation that signal sentence structure or transitions).
    gpt-5-mini
    woing a proper diet according to your body needs.
    GEMMA-2-2B
    12-GEMMASCOPE-RES-16K
    INDEX 422