EXPLANATION TYPE
    oai_token-act-pair
    Description
    OpenAI's Automated Interpretability method, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to add new models and context windows.
    Author
    OpenAI
    URL
    https://github.com/hijohnnylin/automated-interpretability
    Settings
    Default prompts from the main branch, using the TokenActivationPair strategy.
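    As a rough illustration of what a token-activation-pair strategy involves, the sketch below rescales raw activations to integers from 0 to 10 and renders one "token<TAB>activation" line per token for an explainer prompt. The function names, the floor-based rounding, and the example values are assumptions made for illustration; they are not taken from the repository above, whose exact prompt layout may differ.

```python
import math
from typing import List, Sequence


def normalize_activations(activations: Sequence[float], max_activation: float) -> List[int]:
    """Rescale raw activations to integers in [0, 10].

    Assumption: negative values are treated as 0 (inactive) and the
    scaled value is floored, then capped at 10.
    """
    if max_activation <= 0:
        return [0 for _ in activations]
    return [
        min(10, math.floor(10 * max(0.0, a) / max_activation))
        for a in activations
    ]


def format_token_activation_pairs(tokens: Sequence[str], activations: Sequence[int]) -> str:
    """Render one 'token<TAB>activation' line per token."""
    if len(tokens) != len(activations):
        raise ValueError("tokens and activations must have the same length")
    return "\n".join(f"{tok}\t{act}" for tok, act in zip(tokens, activations))


# Hypothetical tokens and activation values, for illustration only.
tokens = [" As", " of", " January"]
raw = [0.0, 1.2, 4.8]
print(format_token_activation_pairs(tokens, normalize_activations(raw, max(raw))))
```

    Discretizing to a small integer range keeps the prompt short and makes high-activation tokens easy for the explainer model to pick out.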
    Recent Explanations
    The neuron detects mentions of scams, fraud, impersonation, and related deceptive or criminal activities.
    gpt-5-mini
    user<bos>419148?
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 78934
    sentences or phrases where the speaker expresses having an idea, thought, or suggestion (first-person cognitive/introspective statements).
    gpt-5-mini
    've had an idea. Coats matching the mane
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 14876
    Finds words that are main action verbs (verbs indicating actions or agency, especially present-tense/third-person and other salient verb forms).
    gpt-5-mini
     use the increasing transistor budget to build ever bigger and more
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 80826
    The neuron detects sentence endings, strongly activating on sentence-final punctuation (periods) and the ends of sentences.
    gpt-5-mini
     take care of their vessels. The listing does not confer
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 122357
    The neuron detects structural or symbolic tokens—numbers, single-letter/math symbols, brackets and formatting/LaTeX tokens, and other document-structure markers.
    gpt-5-mini
     convicted on each count as charged by the amended indictment.
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 40180
    it responds to long user turns / large blocks of contiguous text (i.e., firing when a turn is lengthy).
    gpt-5-mini
     subtypes.↵↵  <end_of_turn>
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 8482
    it detects code- or math-expression style tokens — control-flow and syntax markers, variable names, and numeric literals.
    gpt-5-mini
    ., any more that it cares about the political circus (“
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 30546
    words that signal official statements, policies, principles, goals or other formal/authoritative claims.
    gpt-5-mini
     the principle of the integrity of Denmark, stipulated that the
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 66822
    detects expressions of grief, mourning, condolence, or references to death and loss.
    gpt-5-mini
     felt a painful sting in her chest, knowing she wouldn
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 53290
    The neuron detects structural markers and headings in the text—things like section breaks, metadata tokens, big punctuation headers (=====/-----), and other prominent line-start tokens (e.g., "Q:", "Model", "What", "I'm").
    gpt-5-mini
     their responsibilities."<end_of_turn>
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 68607
    the neuron detects racist or strongly derogatory language aimed at social groups (demeaning/offensive statements).
    gpt-5-mini
    <bos> being a thief is very useful, and an
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 60654
    References to dates or time markers (months and years, often in "as of" or similar date-stamp contexts).
    gpt-5-mini
     attacks.As of January2011
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 102435
    The neuron detects questions—tokens and contexts that form interrogative sentences (question words and/or a question mark).
    gpt-5-mini
     that where the passion began?↵↵Yes, it
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 105834
    the neuron detects descriptions of restless or agitated physical movements (people moving about, fidgeting, pacing, or otherwise physically acting out).
    gpt-5-mini
     having to get up and pace.↵↵Also this article
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 96917
    the neuron detects technical, numeric, or math-like tokens (numbers, operators, variables, and other technical/structured code/math tokens).
    gpt-5-mini
    )/c**(2/7))**(1/4
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 16799
    the neuron detects sentence-level punctuation and clause boundaries (commas, periods, quotation marks and other discourse-transition tokens).
    gpt-5-mini
     the ills downstream from them.People must learn to
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 93626
    mentions of parental concern/protectiveness or parents trying to control/limit a child's behavior.
    gpt-5-mini
     stop being so over-protective so he can grow into
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 123395
    the neuron detects document structure markers and section headings (e.g., Methods, Data sources, Ethics, Availability) and other formatting/metadata lines.
    gpt-5-mini
    ↵↵Data sources and searches----------------<end_of_turn>
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 39004
    phrases expressing strong emotion or emphasis (exclamations and emphatic interjections).
    gpt-5-mini
    m deceased.↵↵Opal: Deceased? Really?↵↵
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 54315
    the neuron activates on content-bearing, informative tokens (important nouns/verbs/adjectives and discourse-focus words) rather than on function words.
    gpt-5-mini
     is enough. For others, its just the
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 82974