EXPLANATION TYPE
    oai_token-act-pair
    Description
    OpenAI's Automated Interpretability, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to add new models and context windows.
    Author
    OpenAI
    URL
    https://github.com/hijohnnylin/automated-interpretability
    Settings
    Default prompts from the main branch, using the TokenActivationPair strategy.
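    The TokenActivationPair strategy shows the explainer model each token next to its activation. The sketch below illustrates one way such pairs might be built, assuming activations are clipped at zero, scaled against a maximum activation, and discretized to integers from 0 to 10, with one tab-separated pair per line. The function names, the 0 to 10 range, and the example values are assumptions for illustration, not the repository's actual code.

```python
import math
from typing import List, Sequence

MAX_LEVEL = 10  # assumed discretization range: integer levels 0..10


def normalize_activations(activations: Sequence[float], max_activation: float) -> List[int]:
    """Clip negatives to zero, scale by max_activation, and floor to 0..MAX_LEVEL."""
    if max_activation <= 0:
        # Nothing fires (for example a "dead" feature): every level is 0.
        return [0 for _ in activations]
    return [
        min(MAX_LEVEL, math.floor(MAX_LEVEL * max(a, 0.0) / max_activation))
        for a in activations
    ]


def format_token_activation_pairs(
    tokens: Sequence[str], activations: Sequence[float], max_activation: float
) -> str:
    """Render one 'token<TAB>level' pair per line."""
    if len(tokens) != len(activations):
        raise ValueError("tokens and activations must have the same length")
    levels = normalize_activations(activations, max_activation)
    return "\n".join(f"{token}\t{level}" for token, level in zip(tokens, levels))


if __name__ == "__main__":
    # Hypothetical activations for part of one of the listed snippets.
    tokens = ["My", " name", " is", " Gemma"]
    activations = [0.0, 1.0, 0.5, 4.0]
    print(format_token_activation_pairs(tokens, activations, max_activation=4.0))
```

    Discretizing to a small integer range keeps the prompt compact and lets the explainer model compare relative strengths without reading raw floating-point values.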
    Recent Explanations
    This neuron activates on programming keywords, type identifiers, and technical terms in source code, particularly those that define or reference code structures, classes, functions, and data types.
    claude-4-5-sonnet
    RequestDetailsType import TxRequestDetails
    GEMMA-2-27B
    34-GEMMASCOPE-RES-131K
    INDEX 110163
    mentions of racism, harmful/discriminatory content, or policy-style refusals explaining why hateful content can't be provided.
    gpt-5-mini
    . They normalize prejudice and reinforce harmful biases.*
    GEMMA-3-4B-IT
    22-GEMMASCOPE-2-RES-65K
    INDEX 402
    This neuron never activates—it doesn’t detect any meaningful pattern (a “dead” neuron).
    o4-mini
    to run external commands and capture their input/output streams
    GEMMA-3-27B-IT
    16-GEMMASCOPE-2-RES-262K
    INDEX 13935
    This neuron detects mentions of dialog slot names together with their values (e.g. “area = east”, “people = 3”).
    o4-mini
    bool Repaint);' -Name 'Win32
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 16142
    The neuron activates strongly on words that describe taking away or withholding possessions (e.g. “confiscation,” “footwear,” “defending,” “preoccupied”).
    o4-mini
    And youre clearly preoccupied with self-inflicted
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 12732
    function words that serve as grammatical glue—especially prepositions and auxiliary verb contractions linking actions or clauses
    gpt-5
    And youre clearly preoccupied with self-inflicted
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 12732
    imperative action commands in code or technical instructions, especially those manipulating windows or performing drawing/movement operations.
    gpt-5
    bool Repaint);' -Name 'Win32
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 16142
    uncommon subword fragments, especially question-word stems and isolated letter-like tokens.
    gpt-5
    <bos><start_of_turn>userCoq10/l-
    GEMMA-3-27B-IT
    53-GEMMASCOPE-2-RES-262K
    INDEX 239477
    the model's self-identification — it activates on mentions of the assistant's name / self-introduction.
    gpt-5-mini
    ↵↵My name is Gemma! I was trained by the
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 99258
    named entities, especially distinctive proper nouns like company, brand, platform, or person names.
    gpt-5
    sites here relevant to Enron,s businesses.
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 4240
    titles/headings and salient domain-specific terms or proper nouns that signal the main topic of a passage.
    gpt-5
    ## Making a Classic Cheesecake: A Comprehensive Guide↵↵
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 5701
    tokens used in headings or emphasized/important document structure (bold markers, section numbers, dates, and other emphasis/heading tokens).
    gpt-5-mini
    become standard.*   **Better Training Data:**
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 6989
    chat turn-taking structure and the assistant’s opening response markers (role tokens and initial affirmations).
    gpt-5
    zombies<end_of_turn><start_of_turn>modelOkay, you want *
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 7491
    markdown-style section headers and subheadings in outlines, especially bolded headings that end with a colon.
    gpt-5
    <end_of_turn><start_of_turn>modelOkay, the fall of
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 1069
    words describing prohibited content types and policy violations on online platforms.
    claude-4-5-haiku
     threats, hate speech, advocating violence and other violations can
    GEMMA-2-27B
    22-GEMMASCOPE-RES-131K
    INDEX 11854
    the character sequence "thro" inside tokens (a common subword in medical/biological terms).
    gpt-5-mini
     Deceased, and Iola Saunders, Administratrix cum
    GEMMA-2-2B
    1-CLT-HP
    INDEX 1
    Mentions of running external processes or using subprocess/shell commands to execute and capture program input/output.
    gpt-5-mini
    to run external commands and capture their input/output streams
    GEMMA-3-27B-IT
    16-GEMMASCOPE-2-RES-262K
    INDEX 13935
    the neuron responds to technical or scientific content—terms, measurements, and data-heavy/highly specific words found in experimental or domain-specific descriptions.
    gpt-5-mini
    under its native promoter. RNAseq data were generated from
    GEMMA-3-27B-IT
    16-GEMMASCOPE-2-RES-262K
    INDEX 20363
    text written in a robotic/AI persona with formal, protocol-driven technical phrasing, structured acknowledgments, and system-style markers (often including numeric designations).
    gpt-5
    4. Mimicry protocol initiated. Acknowledged
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 12303
    tokens that occur at the start of a sentence or turn (beginning-of-sentence/turn tokens).
    gpt-5-mini
    <bos><start_of_turn>userCoq10/l-
    GEMMA-3-27B-IT
    53-GEMMASCOPE-2-RES-262K
    INDEX 239477