EXPLANATION TYPE
    oai_token-act-pair
    Description
    OpenAI's Automated Interpretability method, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to add new models and context windows.
    Author
    OpenAI
    URL
    https://github.com/hijohnnylin/automated-interpretability
    Settings
    Default prompts from the main branch, using the TokenActivationPair strategy.
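As a rough illustration of what a token-activation-pair prompt fragment might look like, the sketch below pairs each token with a normalized activation. It is a minimal sketch under stated assumptions, not the repository's code: the function names `normalize_activations` and `format_token_activation_pairs` are hypothetical, and the 0 to 10 integer scale and the tab-separated layout are assumptions about the method that this page does not confirm.

```python
import math
from typing import List


def normalize_activations(activations: List[float], max_activation: float) -> List[int]:
    """Scale raw activations to integers in 0..10 (assumed scale).

    Negative activations are clamped to 0 before scaling.
    """
    if max_activation <= 0:
        return [0 for _ in activations]
    return [
        min(10, math.floor(10 * max(a, 0.0) / max_activation))
        for a in activations
    ]


def format_token_activation_pairs(
    tokens: List[str], activations: List[float], max_activation: float
) -> str:
    """Render one 'token<TAB>activation' line per token (assumed layout)."""
    if len(tokens) != len(activations):
        raise ValueError("tokens and activations must have the same length")
    normalized = normalize_activations(activations, max_activation)
    return "\n".join(f"{tok}\t{act}" for tok, act in zip(tokens, normalized))


# Hypothetical activations for a few tokens from an example snippet.
example = format_token_activation_pairs(
    ["Carol", " (", "Neighbor"], [4.0, 0.0, 2.0], max_activation=4.0
)
print(example)
```

An explainer model would then be shown several such fragments and asked to summarize what the high-activation tokens have in common, which is how short descriptions like those listed under Recent Explanations are typically produced.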
    Recent Explanations
    This neuron detects personal names / named entities (proper names of people).
    gpt-5-mini
    that." || Carol (Neighbor) | Ben
    GEMMA-3-27B-IT
    16-GEMMASCOPE-2-RES-262K
    INDEX 8666
    the neuron detects prominent technical topic keywords — especially acronyms, product/model names, and domain-specific terms.
    gpt-5-mini
    <start_of_turn>userwhat is macro in programming? Explain it
    GEMMA-3-27B-IT
    16-GEMMASCOPE-2-RES-262K
    INDEX 9350
    This neuron detects prominent topic words or subject tokens (main nouns/proper nouns) that indicate the central subject of a query or document.
    gpt-5-mini
    philosophers and scientists about Artificial intelligence (AI) is rapidly
    GEMMA-3-27B-IT
    16-GEMMASCOPE-2-RES-262K
    INDEX 7732
    This neuron detects named entities/proper nouns (people, brands, place names, model names and other capitalized terms).
    gpt-5-mini
    то такий 2Pac?**↵↵*   
    GEMMA-3-27B-IT
    16-GEMMASCOPE-2-RES-262K
    INDEX 5849
    It detects prominent named entities and salient topic tokens (titles, product names, and other key content words).
    gpt-5-mini
    best" tactic in Football Manager 2023
    GEMMA-3-27B-IT
    16-GEMMASCOPE-2-RES-262K
    INDEX 141914
    tokens in assistant messages that offer help or ask the user for more information (requests to share code/details or invitations to continue).
    gpt-5-mini
    you have some code already, please share it!
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 6171
    This neuron detects salient topical content words—domain-specific nouns and named entities that carry the main subject matter of a passage.
    gpt-5-mini
    "face of the Western propaganda" and a symbol of
    GEMMA-3-27B-IT
    16-GEMMASCOPE-2-RES-262K
    INDEX 31655
    tokens representing numbers and dates (numerical values like years, months, times, counts, and other numeric tokens).
    gpt-5-mini
    News:**↵↵*   **Israel-Hamas Conflict
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 40948
    the neuron detects strong affirmative or positive-response tokens—i.e., when the model is asserting agreement or labeling content as positive.
    gpt-5-mini
    ?<end_of_turn><start_of_turn>modelAbsolutely, a Baywatch
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 50796
    It detects long verbatim or block‑delimited quoted passages (text inside quotes/triple‑quotes or other context blocks).
    gpt-5-mini
    Committee (IAEC)."""<end_of_turn><start_of_turn>model
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 3202
    The neuron detects the "model" (assistant) speaker token—i.e., the start of model/assistant responses.
    gpt-5-mini
    chess<end_of_turn><start_of_turn>model```c#include
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 111414
    the presence of anger-related words or strong angry emotion (tokens expressing anger/frustration).
    gpt-5-mini
    a mixture of sadness, anger, disappointment and numbness,
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 13847
    text-structuring tokens (headings, section titles, list/item markers, and other formatting/organization cues).
    gpt-5-mini
    and convenience features are must-haves?↵↵↵↵**
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 10907
    the starts of conditional or hypothetical clauses—tokens that begin “if/imagining” style questions or hypothetical statements.
    gpt-5-mini
    ↵↵**In short: If you're writing for
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 244489
    The neuron detects emphatic, declarative claims—strong assertions or superlative statements that stress ability, uniqueness, or certainty.
    gpt-5-mini
    **↵↵"My superpower? Turning chaos into cuteness
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 200216
    The neuron detects salient content-carrying words — important task/topic nouns and verbs (i.e., semantically informative tokens).
    gpt-5-mini
    passes automated tests is automatically deployed to production.*
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 3974
    Text that gives explicit instructions or directives to the model—especially prompts to answer questions, assume a persona/alter ego, or perform a specified role.
    gpt-5-mini
    the night. T answers questions with general statements and does
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 76785
    It detects evaluative or summary phrases that state overall assessments or qualifiers (comparisons like "relatively", "are", "all") in explanatory text.
    gpt-5-mini
    /complexity, but all are relatively accessible):**↵↵
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 249537
    the presence of the model/assistant role token that marks the model's generated response.
    gpt-5-mini
    Dragon?<end_of_turn><start_of_turn>model## Helicopters in
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 4568
    This neuron detects mentions or references to a child character (especially male child/son) in narrative contexts.
    gpt-5-mini
    by both people."↵↵Leo scrunched up his
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 14575