    Neuronpedia

    © Neuronpedia 2025
    EXPLANATION TYPE
    oai_token-act-pair
    Description
    OpenAI's Automated Interpretability, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to add new models and context windows.
    Author
    OpenAI
    URL
    https://github.com/hijohnnylin/automated-interpretability
    Settings
    Default prompts from the main branch, strategy TokenActivationPair.
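As a rough illustration of what a token-activation-pair strategy feeds to the explainer model, the sketch below pairs each token with its activation rescaled to an integer from 0 to 10, one tab-separated pair per line. The function names are hypothetical and this is not the repository's actual API; the 0 to 10 scaling and tab-separated layout are assumptions based on the approach described in the paper.

```python
import math
from typing import List


def normalize_activations(activations: List[float], max_activation: float) -> List[int]:
    """Rescale raw activations to integers in 0..10 (hypothetical helper).

    Negative activations are clamped to 0; a non-positive maximum yields all zeros.
    """
    if max_activation <= 0:
        return [0 for _ in activations]
    return [
        min(10, math.floor(10 * max(0.0, a) / max_activation))
        for a in activations
    ]


def format_token_activation_pairs(tokens: List[str], activations: List[float]) -> str:
    """Render one "token<TAB>activation" pair per line (assumed layout)."""
    max_act = max(activations, default=0.0)
    normalized = normalize_activations(activations, max_act)
    return "\n".join(f"{tok}\t{act}" for tok, act in zip(tokens, normalized))


if __name__ == "__main__":
    # Made-up activations, purely for illustration.
    print(format_token_activation_pairs(["share", " certificates", " and"], [2.0, 4.0, 0.0]))
```

Here the maximum is taken over a single text for simplicity; a real pipeline may normalize against the maximum over all of a feature's activation records.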
    Recent Explanations
    the neuron is looking for information about share certificates and ownership documentation.
    gpt-5-nano
    | Purchase ownership shares, earn dividends & capital growth.
    GPT-OSS-20B
    11-RESID-POST-AA
    INDEX 88277
    sentences that report factual information, data, or research/findings (newsy, informational statements).
    gpt-5-mini
    training and activities to protect the ocean.↵“We aim to
    GPT-OSS-20B
    11-RESID-POST-AA
    INDEX 11978
    mentions of gender and gender-focused discourse, especially in scholarly, analytical, or activist contexts.
    gpt-5
    Sexuality Studies** | Gender roles, identity, and
    GPT-OSS-20B
    11-RESID-POST-AA
    INDEX 53982
    legal-style discussions of discrimination and protected grounds, especially definitions or analyses of unfair treatment based on identity characteristics
    gpt-5
    against a specific group based on protected grounds (such as
    GPT-OSS-20B
    11-RESID-POST-AA
    INDEX 43941
    expressions that describe spatial containment or discreet placement, such as something being tucked or positioned within, behind, or between other objects.
    gpt-5
    There it was—tucked between a “To Kill
    GPT-OSS-20B
    11-RESID-POST-AA
    INDEX 12954
    discussions about systemic gender and representation bias—especially in STEM, technology, and research design—highlighting male-centered systems that disadvantage women and advocating for greater diversity and inclusion.
    gpt-5
    and hardware is going to be designed by men. As
    GPT-OSS-20B
    11-RESID-POST-AA
    INDEX 60780
    discussions of gender roles and stereotypes, especially societal expectations distinguishing boys and girls or men and women.
    gpt-5
    this pink for girls and blue for boys. (As
    GPT-OSS-20B
    11-RESID-POST-AA
    INDEX 94961
    This neuron detects mentions of race and racial-group topics, especially content about racial identity, discrimination, representation, or related controversies.
    gpt-5-mini
    receives criticism, while changing a character from white to black
    GPT-OSS-20B
    11-RESID-POST-AA
    INDEX 30563
    phrases containing numeric percentages (digits and the % sign), especially in finance-related contexts (yields, quoted prices, stock/bond figures).
    gpt-5-mini
    % stock" likely means a stock that pays an
    GPT-OSS-20B
    15-RESID-POST-AA
    INDEX 19238
    descriptions of fixed-rate (percent) securities and related yield/price/face-value calculation problems in finance.
    gpt-5
    % stock" likely means a stock that pays an
    GPT-OSS-20B
    15-RESID-POST-AA
    INDEX 19238
    It strongly detects the special start-of-text / start-of-sequence token (the <|startoftext|> marker).
    gpt-5-mini
    <|startoftext|>Sei un esperto
    GPT-OSS-20B
    11-RESID-POST-AA
    INDEX 75938
    beginnings of documents, especially the special start-of-text marker indicating a new text segment.
    gpt-5
    <|startoftext|>Sei un esperto
    GPT-OSS-20B
    11-RESID-POST-AA
    INDEX 75938
    the beginning-of-text control marker (i.e., the special token that denotes the start of a document/segment).
    gpt-5
    <|startoftext|>Sei un esperto
    GPT-OSS-20B
    11-RESID-POST-AA
    INDEX 106286
    the start-of-text marker at the beginning of a segment.
    gpt-5
    <|startoftext|>Sei un esperto
    GPT-OSS-20B
    11-RESID-POST-AA
    INDEX 58601
    words that begin with the letter sequence "Bro".
    gemini-2.5-flash
    <|begin_of_text|>Brooks Kubik is the
    LLAMA3.1-8B-IT
    7-RESID-POST-AA
    INDEX 93282
    mentions of academic institutions and affiliations, especially university names and related titles.
    gpt-5
    . Vivian Fonseca:** (University of California, San Diego
    GEMMA-3-4B-IT
    17-GEMMASCOPE-2-RES-65K
    INDEX 2095
    mentions of academic institutions and affiliations, especially “University of …” names, research labs, and spin‑off attributions after people or companies
    gpt-5
    University of Singapore (now National University of Singapore) and
    GEMMA-3-4B-IT
    22-GEMMASCOPE-2-RES-65K
    INDEX 3498
    safety violation refusals involving sexual abuse, child exploitation, or severe content violations.
    deepseek-r1
    safety guidelines in multiple, severe ways. Here's
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-16K
    INDEX 12353
    mentions of Illinois-specific geography and civic identifiers (state, cities, capitals, ZIPs)
    gpt-5
    often trigger storms.↵    * **Dry Line:**
    GEMMA-3-4B-IT
    29-GEMMASCOPE-2-RES-65K
    INDEX 41307
    This neuron detects mentions of the United States or US-specific entities/contexts (e.g., "United States," "the Fed," US-focused topics).
    gpt-5-mini
    10 First Ladies of the United States, in order
    GEMMA-3-4B-IT
    29-GEMMASCOPE-2-RES-262K
    INDEX 811