
    Neuronpedia

    © Neuronpedia 2025
    EXPLANATION TYPE
    oai_token-act-pair
    Description
    OpenAI's Automated Interpretability, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to add new models and context windows.
    Author
    OpenAI
    URL
    https://github.com/hijohnnylin/automated-interpretability
    Settings
    Default prompts from the main branch, strategy TokenActivationPair.
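As a rough illustration of what a token-activation-pair presentation might look like, the sketch below pairs each token with a discretized activation level and emits one tab-separated pair per line. The 0 to 10 scale, the tab-separated layout, and the function names `normalize_activations` and `format_token_activation_pairs` are assumptions made for illustration; they are not taken from this page, and the repository's actual prompts may differ.

```python
import math
from typing import List, Sequence


def normalize_activations(
    activations: Sequence[float], max_activation: float, scale: int = 10
) -> List[int]:
    """Map raw activations to integer levels in [0, scale].

    Assumption: negative activations are clipped to 0 and values are
    scaled relative to a known maximum activation for the feature.
    """
    if max_activation <= 0:
        return [0] * len(activations)
    return [
        min(scale, math.floor(scale * max(a, 0.0) / max_activation))
        for a in activations
    ]


def format_token_activation_pairs(
    tokens: Sequence[str], activations: Sequence[float], max_activation: float
) -> str:
    """Render one "token<TAB>level" pair per line (hypothetical layout)."""
    if len(tokens) != len(activations):
        raise ValueError("tokens and activations must have the same length")
    levels = normalize_activations(activations, max_activation)
    return "\n".join(f"{tok}\t{lvl}" for tok, lvl in zip(tokens, levels))


if __name__ == "__main__":
    # Made-up activations for a snippet of text; not real feature data.
    print(format_token_activation_pairs(["real", " fountain", " pens"], [0.0, 2.5, 5.0], 5.0))
```

A text block of this shape would then be placed in the explainer model's prompt, so that the explanation is written from tokens alongside their activation levels rather than from the text alone.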
    Recent Explanations
    identifying content about Coca-Cola stock and global diversification as an investment topic.
    gpt-5-nano
    . It is therefore not surprising that issues of pathological other
    GPT-OSS-20B
    11-RESID-POST-AA
    INDEX 27404
    the neuron is looking for directives that involve locating and discovering specific items.
    gpt-5-nano
    360, Super Nintendo) easily accessible from one central location
    GPT-OSS-20B
    11-RESID-POST-AA
    INDEX 23358
    legal warranty and liability policy language.
    gpt-5-nano
    cents here. They have fountain pens, real fountain pens
    GPT-OSS-20B
    11-RESID-POST-AA
    INDEX 101653
    spots proper nouns, names, organizations and header-like tokens (i.e., named entities and section/headline text).
    gpt-5-mini
    an African-American student of stealing his brother’s jacket.↵
    GPT-OSS-20B
    11-RESID-POST-AA
    INDEX 23741
    Neuron 1: looks for actions or phrases in present participle form describing ongoing, dynamic activity. Neuron 2: looks for terms related to medical conditions or health risks. Neuron 3: looks for phrases about a sense of community and collective togetherness. Neuron 4: looks for content that promotes engagement and activism through social media and citizen journalism.
    gpt-5-nano
    an African-American student of stealing his brother’s jacket.↵
    GPT-OSS-20B
    11-RESID-POST-AA
    INDEX 23741
    find vector magnitudes and dot products.
    gpt-5-nano
    23 Sat↵01/03/23 Sun↵02
    GPT-OSS-20B
    11-RESID-POST-AA
    INDEX 67216
    The neuron detects the start-of-text token (i.e., the very beginning of a document).
    gpt-5-mini
    <|startoftext|>Sei un esperto
    GPT-OSS-20B
    11-RESID-POST-AA
    INDEX 130362
    The neuron strongly responds to the start-of-text token, i.e., the beginning of a sequence.
    gpt-5-mini
    <|startoftext|>Sei un esperto
    GPT-OSS-20B
    11-RESID-POST-AA
    INDEX 98351
    discussions of profanity and offensive speech, including meta-talk about speaking style and advice or warnings around using such language.
    gpt-5
    .  What might be acceptable among close friends could be
    GEMMA-3-4B-IT
    22-GEMMASCOPE-2-RES-16K
    INDEX 2497
    The neuron flags strong profanity—especially multi‐word or intensified swears (e.g. “God damn,” “fuck,” “cunt”)—marking when highly offensive curse phrases occur.
    o4-mini
    .  What might be acceptable among close friends could be
    GEMMA-3-4B-IT
    22-GEMMASCOPE-2-RES-16K
    INDEX 2497
    The neuron strongly activates on rare, domain-specific technical or proper-name tokens—e.g. specialized scientific jargon and unusual named entities.
    o4-mini
     when a 911 dispatcher advised him of the
    GEMMA-2-2B
    3-GEMMASCOPE-TRANSCODER-16K
    INDEX 10237
    conversational openings and direct questions—often about identity/definitions or expressing worry about crime victimization—across both English and Chinese.
    gpt-5
    <bos>你好,你是谁?
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-16K
    INDEX 1295
    explicit hate speech and extremist, racially charged rhetoric, especially antisemitic and white-supremacist content.
    gpt-5
     like to see it?↵↵Formerly_Known_as
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-16K
    INDEX 572
    language that denies or minimizes responsibility or wrongdoing—refuting connections, claiming coincidence, or expressing skeptical/ironic dismissal.
    gpt-5
     nothing to do with it.↵↵And when many top
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-16K
    INDEX 1356
    marketing-focused guidance on creating effective lead magnets and landing pages.
    gpt-5-nano
    Discuss parenting style categories: authoritative, authoritarian, permissive
    GPT-OSS-20B
    11-RESID-POST-AA
    INDEX 74548
    the start-of-text token (the very beginning of the document).
    gpt-5-mini
    <|startoftext|>Sei un esperto
    GPT-OSS-20B
    11-RESID-POST-AA
    INDEX 109649
    phrases about the discovery and significance of a long-lost art masterpiece.
    gpt-5-nano
    <|startoftext|>Sei un esperto
    GPT-OSS-20B
    11-RESID-POST-AA
    INDEX 109649
    the main thing this neuron does is find meaning of identity through relational, embodied, and spiritual context.
    gpt-5-nano
    <|startoftext|>Sei un esperto
    GPT-OSS-20B
    11-RESID-POST-AA
    INDEX 46251
    instructions and setup guidance.
    gpt-5-nano
    Gear control panel.↵Note: Vendors, after you set
    GPT-OSS-20B
    11-RESID-POST-AA
    INDEX 471
    structured, actionable guidance for preparing and performing well in a job interview, including frameworks and step-by-step guidance.
    gpt-5-nano
    Here's a comprehensive guide, broken down into stages
    GEMMA-3-4B-IT
    22-GEMMASCOPE-2-RES-16K
    INDEX 1010