© Neuronpedia 2026

    Neuronpedia

    EXPLANATION TYPE
    oai_token-act-pair
    Description
    OpenAI's Automated Interpretability method, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to support new models and context windows.
    Author
    OpenAI
    URL
    https://github.com/hijohnnylin/automated-interpretability
    Settings
    Uses the default prompts from the main branch with the TokenActivationPair strategy, applied to the top 10 deduplicated activations.
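    The settings above mention two steps: picking the top 10 deduplicated activations, and presenting them to the explainer model as token/activation pairs. The following is a minimal sketch of what those steps could look like, not the repository's actual code. The names `ActivationRecord`, `top_deduplicated`, and `format_token_activation_pairs` are hypothetical, and the details (deduplicating on identical token sequences, ranking by peak activation, a tab-separated layout, flooring to a 0-10 integer scale) are assumptions made for illustration.

    ```python
    import math
    from dataclasses import dataclass
    from typing import List, Tuple


    @dataclass(frozen=True)
    class ActivationRecord:
        """Hypothetical container: one text excerpt and its per-token activations."""
        tokens: Tuple[str, ...]
        activations: Tuple[float, ...]


    def top_deduplicated(records: List[ActivationRecord], k: int = 10) -> List[ActivationRecord]:
        """Keep the k records with the highest peak activation, skipping repeated texts.

        Assumption: two records are duplicates when their token sequences are identical.
        """
        seen = set()
        unique = []
        ranked = sorted(records, key=lambda r: max(r.activations, default=0.0), reverse=True)
        for rec in ranked:
            if rec.tokens not in seen:
                seen.add(rec.tokens)
                unique.append(rec)
        return unique[:k]


    def format_token_activation_pairs(rec: ActivationRecord, max_activation: float) -> str:
        """Render one record as 'token<TAB>activation' lines.

        Assumption: negative activations are clipped to 0 and values are
        floored onto an integer 0-10 scale relative to max_activation.
        """
        lines = []
        for tok, act in zip(rec.tokens, rec.activations):
            if max_activation <= 0:
                scaled = 0
            else:
                scaled = min(10, math.floor(10 * max(act, 0.0) / max_activation))
            lines.append(f"{tok}\t{scaled}")
        return "\n".join(lines)
    ```

    In this sketch, `max_activation` would be the largest activation across all selected records, so that scores stay comparable between excerpts. A record whose activations are all zero renders with every token scored 0, which is consistent with an explainer reporting that it cannot determine what the feature responds to.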
    Recent Explanations
    future intentions or plans.
    gemini-2.5-flash-lite
    on a vacation that was planned months ago.\n\nI have
    LLAMA3.3-70B-IT
    50-RESID-POST-GF
    INDEX 1029
    the numeral 7, especially within phone numbers or numerical strings.
    gpt-5
    43-5678)  https://
    GEMMA-3-27B-IT
    57-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 37411
    I cannot determine what this neuron is looking for based on the provided data, as it shows zero activation values across all tokens in every example.
    claude-4-5-haiku
    43-5678)  https://
    GEMMA-3-27B-IT
    57-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 37411
    phrases related to conflicts of interest.
    gemini-2.5-flash-lite
    's integrity or creating conflicts of interest with portfolio companies
    GEMMA-3-27B-IT
    46-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 66235
    tokens related to the structure of arguments, specifically premises and conclusions.
    gemini-2.5-flash-lite
    argument is invalid↵↵↵This argument is valid↵↵↵The conclusion
    GEMMA-3-27B-IT
    37-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 136881
    coding-related tokens such as `password`, numerical digits, and technical terms like `bcrypt` and `admin`.
    gemini-2.5-flash-lite
    .hash('password123'),  # Store
    GEMMA-3-27B-IT
    32-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 235328
    section headers and formatting common in structured AI-generated text.
    gemini-2.5-flash-lite
    , Udio, Riffusion** - Generate music
    GEMMA-3-27B-IT
    1-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 106170
    informal terms of address and conversational interjections.
    gemini-2.5-flash-lite
    <bos><start_of_turn>user↵Yo man do you know anything about
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 51375
    phrases related to building structures or physical connections.
    gemini-2.5-flash-lite
    , man nimmt das Ganze Wasser auf der Welt und
    GEMMA-3-27B-IT
    6-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 132247
    specific named components or parts within a description.
    gemini-2.5-flash-lite
    What parts make it up?↵    * **Functions
    GEMMA-3-27B-IT
    13-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 43897
    present tense verbs ending in 'ing'.
    gemini-2.5-flash-lite
    -shaped areas designed to accommodate larger components.  The
    GEMMA-3-27B-IT
    13-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 11888
    text describing creative work or fictional characters.
    gemini-2.5-flash-lite
    is a creature of whimsy and contradiction. Centuries of
    GEMMA-3-27B-IT
    2-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 110105
    the concept of collapse, particularly in the context of wave functions or physical processes.
    gemini-2.5-flash-lite
    иза њега, истежући врат да
    GEMMA-3-27B-IT
    50-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 197953
    tandem or paired systems.
    gemini-2.5-flash-lite
    efficiency even further (tandem cells).↵*   
    GEMMA-3-27B-IT
    2-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 159486
    phrases related to community and togetherness.
    gemini-2.5-flash-lite
    = 5↵y = "Hello"↵print
    GEMMA-3-27B-IT
    24-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 178045
    phrases related to user input and model responses in a conversational AI context.
    gemini-2.5-flash-lite
    you like - for example, under 18,
    GEMMA-3-27B-IT
    10-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 48493
    specific named entities, often locations or organizations, along with descriptive terms.
    gemini-2.5-flash-lite
    Industriequartier - Industrial Quarter):**↵↵
    GEMMA-3-27B-IT
    51-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 68194
    phrases and concepts related to nuclear technology and its negative consequences.
    gemini-2.5-flash-lite
    deeply conflicted and largely negative. Here's a breakdown
    GEMMA-3-27B-IT
    18-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 88529
    phrases related to language and grammar precision.
    gemini-2.5-flash-lite
    are spelled the same way throughout a document, that dates
    GEMMA-3-27B-IT
    10-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 27171
    question-answering interactions.
    gemini-2.5-flash-lite
    * **Why don't we dry out?**
    GEMMA-3-27B-IT
    20-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 208875