EXPLANATION TYPE
    oai_token-act-pair
    Description
    OpenAI's Automated Interpretability, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to support new models and context windows.
    Author
    OpenAI
    URL
    https://github.com/hijohnnylin/automated-interpretability
    Settings
    Default prompts from the main branch, strategy TokenActivationPair.
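A minimal sketch of what a token-activation-pair rendering might look like, for orientation only. This is not the repository's code: the function names, the 0-10 integer scale, the floor rounding, and the tab separator are all assumptions made for illustration. The idea is that each token in an excerpt is shown to the explainer model next to its (rescaled) activation.

```python
from typing import List, Sequence

# Assumption: activations are rescaled to integers in [0, 10].
MAX_NORMALIZED_ACTIVATION = 10


def normalize_activations(activations: Sequence[float], max_activation: float) -> List[int]:
    """Clamp negatives to zero and rescale to the integer range 0..10 (hypothetical helper)."""
    if max_activation <= 0:
        return [0 for _ in activations]
    return [
        min(
            MAX_NORMALIZED_ACTIVATION,
            int(MAX_NORMALIZED_ACTIVATION * max(a, 0.0) / max_activation),
        )
        for a in activations
    ]


def format_token_activation_pairs(
    tokens: Sequence[str], activations: Sequence[float], max_activation: float
) -> str:
    """Render one 'token<TAB>activation' pair per line (hypothetical helper)."""
    normalized = normalize_activations(activations, max_activation)
    return "\n".join(f"{tok}\t{act}" for tok, act in zip(tokens, normalized))


if __name__ == "__main__":
    # Made-up activations for a short code-like excerpt.
    print(format_token_activation_pairs(["if", " is", "_prime"], [0.0, 2.5, 5.0], 5.0))
```

The explainer model would then be asked to summarize, in a short phrase, what the highly activating tokens have in common; the entries under Recent Explanations are examples of such summaries.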
    Recent Explanations
    short, topic-defining headings or key nouns that state the main subject of the text or user request.
    gpt-5
    <start_of_turn>userCO2 Emissions and Population Density Nexus<end_of_turn>
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 19432
    conversation turn delimiters (especially end-of-turn) and short, title-like user prompts.
    gpt-5
    Presenting a financial data<end_of_turn><start_of_turn>modelOkay
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 6719
    code/documentation formatting markers and short identifiers within technical snippets (e.g., bullets, flags, comments, and variable-like tokens).
    gpt-5
    * **`TO sw_dc_da`
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 124831
    the start of an assistant’s reply, especially introductory framing that sets up the discussion of the user’s topic.
    gpt-5
    <end_of_turn><start_of_turn>modelOkay, let's
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 36615
    identifiers and tokens from code snippets, especially snake_case function/variable names with underscores and related code-format elements.
    gpt-5
    1):    if is_prime(number):
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 11232
    technical syntax and symbol-heavy tokens in code or formatted text, especially XPath expressions like following-sibling and similar structured snippets.
    gpt-5
    -sibling::*[1]") # Find the first following
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 85285
    explanatory openings that introduce a step-by-step breakdown of a technical item (e.g., code, commands, or messages).
    gpt-5
    s break down this R code snippet piece by piece.
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 39389
    key section headers and bolded, action-oriented list items in structured, instructional responses.
    gpt-5
    ! A shimmering, silver one, if you must know
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 247557
    sexually explicit or suggestive content, including nudity, erotic scenarios, and discussions of sexualization or objectification.
    gpt-5
    are a sexy assistant who has been working for me for
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 35455
    descriptions of artificial intelligence and futuristic science-fiction scenarios, especially space colonization, advanced technology, and formal techno-policy or military-style discourse.
    gpt-5
    strating* it. After the disastrous early attempts at
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 6370
    mentions of adolescents—especially explicit teen ages or references to teenage status and context.
    gpt-5
    **The challenges of fitting in and finding your place:**
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 9078
    references to the fashion modeling world and related industry contexts.
    gpt-5
    light.From photoshoots to playdates,
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 16374
    first-person narratives about romantic relationship conflict, especially shifts in affection, betrayal/infidelity, and communication breakdowns.
    gpt-5
    it will be ok. She agreed. Then we talked
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 44002
    formatting and structural cues in prompts and dialogues, such as section labels, list items, numbering, and emphasized elements.
    gpt-5
    in a rock<end_of_turn><start_of_turn>modelThis is a
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 12177
    structured how-to or advice-style breakdowns that explain what to do, when, and why, often organized into clear steps with safety guidance.
    gpt-5
    ! Here's a breakdown:↵↵**When the
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 18733
    scripted multi-speaker dialogue structure, especially speaker turn labels and direct-address conversational turns typical of roleplay or staged conversations.
    gpt-5
    's a high bar. We all know the limitations
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 4786
    sentence openings that initiate “how”-type questions, with especially strong response to quantitative formulations.
    gpt-5
    <bos><start_of_turn>userHow do chess programs work?
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 9214
    references to emergency medical response—especially first aid/CPR, rescuing injured or unconscious people, and contacting emergency services.
    gpt-5
       *   **First Aid/CPR:**
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 14569
    mentions of female people—she/her subjects, women’s roles or names—especially in intimate, relational, or caregiving contexts.
    gpt-5
    to BDSM.She loves it when I tie
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 2700
    structural and discourse cues of the model’s step-by-step math solution (e.g., response headers, newlines/section breaks, and procedural lead-ins indicating the start of an explanation).
    gpt-5
    ?<end_of_turn><start_of_turn>modelWe are given two equations
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 169436