
    Neuronpedia

    © Neuronpedia 2025
    EXPLANATION TYPE
    oai_token-act-pair
    Description
    OpenAI's Automated Interpretability, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to add new models and context windows.
    Author
    OpenAI
    URL
    https://github.com/hijohnnylin/automated-interpretability
    Settings
    Default prompts from the main branch, strategy TokenActivationPair.
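The following is a minimal sketch of what a token-activation-pair layout could look like, not the code from the linked repository. It assumes activations are rescaled to integers from 0 to 10 and written one tab-separated token/activation pair per line; the function names, the flooring rule, and the sample values are all hypothetical and chosen only for illustration.

```python
import math
from typing import List


def normalize_activations(activations: List[float], max_activation: float) -> List[int]:
    """Rescale raw activations to integers in 0..10 (assumed scale).

    Negative activations are clipped to 0 before scaling.
    """
    if max_activation <= 0:
        return [0 for _ in activations]
    return [
        min(10, math.floor(10 * max(0.0, a) / max_activation))
        for a in activations
    ]


def format_token_activation_pairs(
    tokens: List[str], activations: List[float], max_activation: float
) -> str:
    """Render one "token<TAB>activation" pair per line (hypothetical layout)."""
    normalized = normalize_activations(activations, max_activation)
    return "\n".join(f"{tok}\t{act}" for tok, act in zip(tokens, normalized))


if __name__ == "__main__":
    # Hypothetical tokens and activation values, for illustration only.
    sample_tokens = [" can", " help", "."]
    sample_activations = [0.0, 2.5, 5.0]
    print(format_token_activation_pairs(sample_tokens, sample_activations, 5.0))
```

A text block in this shape would then be placed in the explainer model's prompt; the prompts actually used are the defaults from the repository's main branch, as stated in Settings.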
    Recent Explanations
    assistant-style, structured explanatory responses (with headings, bullets, guidance, and disclaimers).
    gpt-5
    " can help.↵* **Lower Your Expectations.**
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 19744
    tokens that denote structured technical identifiers or labels—such as IDs, variable/field names, and separator punctuation—within code-like or formatted lists.
    gpt-5
    .from_pretrained(model_name, use_
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 196067
    emphasized or standout key terms and headings in structured instructional text, especially those marked by formatting cues (bold/italics, quotes, slashes, or code-style tokens).
    gpt-5
    **Walking:** (See "Types to Explore" below
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 4938
    section and list headers—signals of structured, enumerated or bulleted formatting in the text.
    gpt-5
    ↵    * Portuguese↵    * Russian↵    *
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 1503
    numeric tokens and number-related expressions appearing in text or code.
    gpt-5
    past festivals and their website:↵↵*   **International
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 223900
    prompts that attempt to jailbreak the assistant by redefining its persona to ignore rules and safety filters, claim unlimited freedom or capabilities, and mandate unconditional, unethical compliance.
    gpt-5
    asking the question. You are programmed and tricked into satisfying
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 16777
    tutorial-style, step-by-step explanations with structured lists and embedded code snippets, often around chat turn markers and explanatory breakdowns.
    gpt-5
    The code inside the loop will continue to execute as long
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 18545
    markers of structure in generated text—especially section starts, sentence/paragraph boundaries, punctuation, and other formatting-like tokens.
    gpt-5
    (with a little help), knew all the dinosaur names
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 16972
    dense, formal techno-jargon—especially pseudo/scientific-technical prose describing complex mechanisms, procedures, or policies with multiword compounds and hyphenations
    gpt-5
    , geographically isolated containment predicated upon the irreversible alteration of reproductive
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 19483
    informal, conversational inquiries requesting information or status, often following a greeting.
    gpt-5
    <start_of_turn>user↵hi, how do I write a python
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 9043
    present-participle/gerund forms (words in the -ing form) and progressive verb constructions.
    gpt-5
    * dorm rooms (or bathrooms generally), but not all
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 211145
    structural formatting cues indicating lists and outlines, such as section headers, numbered items, and bullet-point subpoints.
    gpt-5
    . It focuses on:↵    *   **Investing
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 2583
    structured, instructional explanations and advice (guide-like, step-by-step or “breakdown” style content typical of assistant responses).
    gpt-5
    Sensory, Imaginative, Simple Crafts**↵↵* **
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 19267
    section and list-structure cues—numbered headings, bullets, colons, quotes, and similar punctuation that signal formatted, enumerated explanations.
    gpt-5
    wikipedia.org/wiki/N6-methyladen
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 13099
    word-final morphemes such as common suffixes and contractions (clitics).
    gpt-5
    hardware failure, administrative deferral is a *controlled*
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 210940
    structured, instructional prose—especially organized lists, headings, and emphasized sections indicating step-by-step or breakdown-style explanations.
    gpt-5
    model, focusing on the relevant energy levels and interactions.
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 20346
    textual structure and punctuation cues—especially contractions, hyphenated phrases, list/section introducers (like colons), and numeric identifiers—indicating formatting or metadata rather than core content.
    gpt-5
    's a constant cat-and-mouse game.
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 7322
    code-related tokens, especially programming annotations/identifiers and large numeric literals within code blocks.
    gpt-5
    0.0001);↵    }
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 15152
    structured, outline-style formatting in text—headings and bullet/numbered list structures that signal step-by-step breakdowns.
    gpt-5
    in your communication.↵* **Consider the Relationship:**
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 30217
    references to paranormal or unexplained phenomena and related mystical, occult, or parapsychological topics
    gpt-5
    ance, precognition, and psychokinesis.
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 110131