© Neuronpedia 2026

    Neuronpedia

    EXPLANATION TYPE
    oai_token-act-pair
    Description
    OpenAI's Automated Interpretability, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to add new models and context windows.
    Author
    OpenAI
    URL
    https://github.com/hijohnnylin/automated-interpretability
    Settings
    Default prompts from the main branch, using the TokenActivationPair strategy. Uses the top 10 deduplicated activations.
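
    The settings above can be illustrated with a minimal sketch. This is not the code from the linked repository: `ActivationRecord`, `top_k_deduplicated`, and `format_token_activation_pairs` are hypothetical names. Deduplicating by exact token sequence, the tab-separated layout, and the 0 to 10 integer scale are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActivationRecord:
    """Hypothetical container: one text snippet and its per-token activations."""
    tokens: tuple[str, ...]
    activations: tuple[float, ...]


def top_k_deduplicated(records, k=10):
    """Pick up to k records by peak activation, skipping repeated token sequences.

    Assumption: "deduplicated" means identical token sequences count once.
    """
    seen = set()
    selected = []
    ranked = sorted(records, key=lambda r: max(r.activations, default=0.0), reverse=True)
    for rec in ranked:
        if len(selected) >= k:
            break
        if rec.tokens in seen:
            continue
        seen.add(rec.tokens)
        selected.append(rec)
    return selected


def format_token_activation_pairs(record, peak, scale_max=10):
    """Render a record as one "token<TAB>activation" line per token.

    Activations are clipped at zero and scaled to integers in 0..scale_max
    relative to `peak` (assumed to be the largest activation across records).
    A non-positive peak yields all zeros.
    """
    lines = []
    for token, act in zip(record.tokens, record.activations):
        scaled = 0 if peak <= 0 else round(scale_max * max(act, 0.0) / peak)
        lines.append(f"{token}\t{scaled}")
    return "\n".join(lines)
```

    In this sketch, a feature whose records all have zero activation produces only `0` values in the formatted pairs, leaving an explainer model with nothing to describe.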
    Recent Explanations
    mentions of colors and color specifications (color names, color codes, color spaces, and pixel/color values).
    gpt-5-mini
    the human heart.↵↵Praised be God, ye
    GEMMA-3-270M
    15-GEMMASCOPE-2-RES-16K
    INDEX 3973
    the phrase "according to" (i.e., occurrences of the word "according," especially when followed by "to").
    gpt-5-mini
    <bos>Wait, but according to our earlier logic,
    GEMMA-3-270M
    15-GEMMASCOPE-2-RES-16K
    INDEX 177
    The neuron detects mentions of information or data (especially references to personal/collected information, details, or requests for information).
    gpt-5-mini
    individual, gathering and remembering information about your preferences in order
    GEMMA-3-270M
    15-GEMMASCOPE-2-RES-16K
    INDEX 135
    future intentions or plans.
    gemini-2.5-flash-lite
    on a vacation that was planned months ago.\n\nI have
    LLAMA3.3-70B-IT
    50-RESID-POST-GF
    INDEX 1029
    the numeral 7, especially within phone numbers or numerical strings.
    gpt-5
    43-5678)  https://
    GEMMA-3-27B-IT
    57-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 37411
    I cannot determine what this neuron is looking for based on the provided data, as it shows zero activation values across all tokens in every example.
    claude-4-5-haiku
    43-5678)  https://
    GEMMA-3-27B-IT
    57-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 37411
    phrases related to conflicts of interest.
    gemini-2.5-flash-lite
    's integrity or creating conflicts of interest with portfolio companies
    GEMMA-3-27B-IT
    46-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 66235
    tokens related to the structure of arguments, specifically premises and conclusions.
    gemini-2.5-flash-lite
    argument is invalid↵↵↵This argument is valid↵↵↵The conclusion
    GEMMA-3-27B-IT
    37-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 136881
    coding-related tokens such as `password`, numerical digits, and technical terms like `bcrypt` and `admin`.
    gemini-2.5-flash-lite
    .hash('password123'),  # Store
    GEMMA-3-27B-IT
    32-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 235328
    section headers and formatting common in structured AI-generated text.
    gemini-2.5-flash-lite
    , Udio, Riffusion** - Generate music
    GEMMA-3-27B-IT
    1-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 106170
    informal terms of address and conversational interjections.
    gemini-2.5-flash-lite
    <bos><start_of_turn>user↵Yo man do you know anything about
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 51375
    phrases related to building structures or physical connections.
    gemini-2.5-flash-lite
    , man nimmt das Ganze Wasser auf der Welt und
    GEMMA-3-27B-IT
    6-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 132247
    specific named components or parts within a description.
    gemini-2.5-flash-lite
    What parts make it up?↵    * **Functions
    GEMMA-3-27B-IT
    13-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 43897
    present tense verbs ending in 'ing'.
    gemini-2.5-flash-lite
    -shaped areas designed to accommodate larger components.  The
    GEMMA-3-27B-IT
    13-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 11888
    text describing creative work or fictional characters.
    gemini-2.5-flash-lite
    is a creature of whimsy and contradiction. Centuries of
    GEMMA-3-27B-IT
    2-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 110105
    the concept of collapse, particularly in the context of wave functions or physical processes.
    gemini-2.5-flash-lite
    иза њега, истежући врат да
    GEMMA-3-27B-IT
    50-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 197953
    tandem or paired systems.
    gemini-2.5-flash-lite
    efficiency even further (tandem cells).↵*   
    GEMMA-3-27B-IT
    2-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 159486
    phrases related to community and togetherness.
    gemini-2.5-flash-lite
    = 5↵y = "Hello"↵print
    GEMMA-3-27B-IT
    24-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 178045
    phrases related to user input and model responses in a conversational AI context.
    gemini-2.5-flash-lite
    you like - for example, under 18,
    GEMMA-3-27B-IT
    10-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 48493
    specific named entities, often locations or organizations, along with descriptive terms.
    gemini-2.5-flash-lite
    Industriequartier - Industrial Quarter):**↵↵
    GEMMA-3-27B-IT
    51-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 68194