EXPLANATION TYPE
    oai_token-act-pair
    Description
    OpenAI's Automated Interpretability, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to add new models and context windows.
    Author
    OpenAI
    URL
    https://github.com/hijohnnylin/automated-interpretability
    Settings
    Default prompts from the main branch, strategy TokenActivationPair.
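The TokenActivationPair strategy named under Settings presents each token next to its activation when prompting the explainer model. A minimal sketch of that formatting follows, assuming activations are clamped at zero, scaled against the maximum activation, and floored to integers from 0 to 10, with one tab-separated pair per line. The names `discretize`, `format_pairs`, and `MAX_LEVEL` are hypothetical and are not taken from the repository; the exact scaling and separator used on the main branch may differ.

```python
import math
from typing import List, Optional, Sequence

MAX_LEVEL = 10  # assumed top of the discretized activation scale


def discretize(activations: Sequence[float], max_activation: float) -> List[int]:
    """Map raw activations to integers in 0..MAX_LEVEL; negatives clamp to 0."""
    if max_activation <= 0:
        # Nothing fires: every token gets level 0.
        return [0 for _ in activations]
    return [
        min(MAX_LEVEL, math.floor(MAX_LEVEL * max(a, 0.0) / max_activation))
        for a in activations
    ]


def format_pairs(
    tokens: Sequence[str],
    activations: Sequence[float],
    max_activation: Optional[float] = None,
) -> str:
    """Render one 'token<TAB>level' line per token (illustrative layout only)."""
    if len(tokens) != len(activations):
        raise ValueError("tokens and activations must have the same length")
    if max_activation is None:
        # Assumption: scale against the largest activation in this record.
        max_activation = max(activations, default=0.0)
    levels = discretize(activations, max_activation)
    return "\n".join(f"{tok}\t{lvl}" for tok, lvl in zip(tokens, levels))


if __name__ == "__main__":
    # Made-up activations for a fragment of one of the samples listed below.
    print(format_pairs([" sounds", " crazy", " but", " true"], [0.0, 2.0, 0.0, 4.0]))
```

Passing an explicit `max_activation` would keep levels comparable across several text samples for the same feature, whereas the default rescales each sample on its own.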
    Recent Explanations
    It detects the capitalized definite article "The", especially at the start of sentences or section/paragraph openings.
    gpt-5-mini
    -------------------------------------------↵↵The decomposition profiles of the four
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 28322
    mentions of the occurrence or onset of an event or symptoms (words/phrases indicating someone experienced something or when it happened).
    gpt-5-mini
     started leaking or when she experienced the first onset of symptoms
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 105505
    tokens that signal factual statements or methodological/results-related assertions in scientific/technical writing.
    gpt-5-mini
    I2C_RATE_3 | MANT
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 53566
    The neuron detects emphatic assertions that something is true or real—claims by the speaker insisting the information is genuine or not a joke.
    gpt-5-mini
     know that sounds crazy but true. Someone may say something
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 101301
    The neuron is looking for proper names and named-entity tokens (personal names and other capitalized entity words).
    gpt-5-mini
     want to write about Henrys cousin Jesse. I
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 72712
    Indicators of regulation or changes in expression level (mentions that a process or gene is up‑ or down‑regulated).
    gpt-5-mini
     and oxidative phosphorylation (OXPHOS) is downregulated
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 85763
    The neuron detects numeric and quantitative information — i.e., measurements, statistics and other quantitative expressions in scientific or technical text.
    gpt-5-mini
     showed a dose-dependent suppression<end_of_turn>
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 93494
    instances of the first-person pronoun "I" (self-references).
    gpt-5-mini
     does come from, because I'm very interested by
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 82493
    tokens that appear in technical or formatted contexts (code identifiers, XML/tags, section-heading or list-intro words and other emphasized/structural document tokens).
    gpt-5-mini
     Wikipedia, so take it for what its worth
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 73367
    sentences or phrases that are asking a question (especially wh‑words like "why/what/how" and other interrogative phrasing).
    gpt-5-mini
     encryption and certificates, so why would using private/public
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 78220
    spots first- and second-person pronouns and other self-/addressee-focused words (e.g., "I", "you", "we", "do") indicating speaker-directed or conversational language.
    gpt-5-mini
     pokemon fanfiction, or whatever you'd like to say
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 40786
    The neuron detects sentence-initial interrogative or conditional clause starters—i.e., the beginnings of questions or conditionals.
    gpt-5-mini
    <bos><start_of_turn>userWhen is it today?<end_of_turn>
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 47101
    the neuron detects salient content words—informative nouns/adjectives and topical keywords in the text.
    gpt-5-mini
    , a gate-to-source capacitance of the PM
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 40167
    It detects tokens that mark reported speech or attribution (e.g., words and phrases indicating someone says, told, claims, was asked, or was told).
    gpt-5-mini
     here I've been told that unnecessarily using sudo should
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 56649
    the neuron detects discourse or stance markers — short words that signal emphasis, evaluation, comparison, or framing (e.g., "truth", "clear", "more/than", "for", "into", "able").
    gpt-5-mini
     that they have learned much of anything from the2
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 116269
    phrases describing nausea, vomiting, diarrhea, or related motion/sea-sickness symptoms.
    gpt-5-mini
     known for making riders experience nausea. The Tilt-A
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 90400
    the neuron detects evaluative or normative language — words expressing judgments, approval/disapproval, permissibility, harm or suppression.
    gpt-5-mini
     concludes that it is NOT permissible to say, "I
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 1515
    The neuron detects tokens that introduce or point to important statements or key content (words that mark results, topics, or clause-leading discourse markers).
    gpt-5-mini
     b a') m r that "reflects" the
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 27228
    the neuron detects words and phrases expressing intention, willingness, ability, or deliberate action (volition).
    gpt-5-mini
     as you gradually let go of your beliefs, did the
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 116770
    statements expressing measurement, dependence, or causal/functional relationships in technical or scientific text.
    gpt-5-mini
     the initial longitudinal magnetization that is left after the dummy trains
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 20372