EXPLANATION TYPE
    oai_token-act-pair
    Description
    OpenAI's Automated Interpretability method, from the paper "Language models can explain neurons in language models"; modified by Johnny Lin to support additional models and context windows.
    Author
    OpenAI
    URL
    https://github.com/hijohnnylin/automated-interpretability
    Settings
    Default prompts from the main branch, using the TokenActivationPair strategy.
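    The TokenActivationPair strategy named above presents the explainer model with each token paired with its activation. As a rough illustration only (this is a hypothetical sketch, not code from the automated-interpretability repository; the function name and exact formatting are assumptions), the pairing might look like this, with activations normalized to integers on a 0-10 scale:

    ```python
    def format_token_activation_pairs(tokens, activations):
        """Pair each token with its activation, scaled to an integer 0-10.

        One tab-separated "token<TAB>score" pair per line, the shape of
        input an explainer model could summarize into an explanation.
        """
        max_act = max(activations) or 1.0  # avoid division by zero
        lines = []
        for tok, act in zip(tokens, activations):
            scaled = round(10 * act / max_act)
            lines.append(f"{tok}\t{scaled}")
        return "\n".join(lines)

    # Example: the third token carries almost all of the activation.
    body = format_token_activation_pairs(
        ["The", " neuron", " fires", " here"],
        [0.0, 0.2, 3.1, 0.4],
    )
    print(body)
    ```

    The real pipeline additionally wraps such pairs in the default prompts mentioned above before querying the explainer model.
    
    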
    Recent Explanations
    structured, actionable guidance for preparing and performing well in a job interview, including frameworks and step-by-step guidance.
    gpt-5-nano
    Here's a comprehensive guide, broken down into stages
    GEMMA-3-4B-IT
    22-GEMMASCOPE-2-RES-16K
    INDEX 1010
    references to government institutions and public-relations/political acronyms within geopolitical contexts.
    gpt-5
    s Republic" (LPR).Gradually integrate
    GEMMA-3-12B-IT
    24-GEMMASCOPE-2-RES-16K
    INDEX 687
    The neuron activates on three-letter all-caps acronyms.
    o4-mini
    s Republic" (LPR).Gradually integrate
    GEMMA-3-12B-IT
    24-GEMMASCOPE-2-RES-16K
    INDEX 687
    meta-discursive signposts that structure explanations, such as comparative cues, references, section/outline markers, and framing of key points.
    gpt-5
    , here are two answers to "What is the best
    GEMMA-3-12B-IT
    24-GEMMASCOPE-2-RES-16K
    INDEX 200
    The neuron fires on words ending in the suffix “-ization.”
    o4-mini
    food rewards), and gradual desensitization.Never
    GEMMA-3-12B-IT
    24-GEMMASCOPE-2-RES-16K
    INDEX 11996
    the neuron fires on blocks of natural-language explanation (prose commentary), as opposed to code tokens.
    o4-mini
    async` means the browser will continue parsing the HTML while
    GEMMA-3-12B-IT
    24-GEMMASCOPE-2-RES-16K
    INDEX 89
    This neuron specifically detects the word-piece sequence for the contraction “They’re.”
    o4-mini
    and the Data Economy. They're related, but
    GEMMA-3-12B-IT
    24-GEMMASCOPE-2-RES-16K
    INDEX 10507
    This neuron primarily detects PHP opening tags (e.g. “<?” or “<?php”).
    o4-mini
    PlusOne(){return 1 +
    GEMMA-3-12B-IT
    24-GEMMASCOPE-2-RES-16K
    INDEX 61
    It activates on uncommon/rare or domain-specific tokens — long multi-subword pieces like technical terms, proper nouns, or oddly segmented words.
    gpt-5-mini
    **IP Blocking & Geoblocking:**Even if
    GEMMA-3-12B-IT
    24-GEMMASCOPE-2-RES-16K
    INDEX 154
    The neuron selectively activates on long, multi-syllable, domain-specific technical terms and jargon.
    o4-mini
    **IP Blocking & Geoblocking:**Even if
    GEMMA-3-12B-IT
    24-GEMMASCOPE-2-RES-16K
    INDEX 154
    The neuron strongly activates on specific plant variety (cultivar) names in lists.
    o4-mini
    ↵↵2.**Papaya (Carica papaya
    GEMMA-3-12B-IT
    24-GEMMASCOPE-2-RES-16K
    INDEX 515
    The neuron fires strongly on mentions of specific software/model names—most notably “Llama”/“llama.cpp” (and similar acronyms), i.e. tokens that are part of those library or model identifiers.
    o4-mini
    .Refer to the Llama.cpp documentation for
    GEMMA-3-12B-IT
    24-GEMMASCOPE-2-RES-16K
    INDEX 217
    This neuron detects numeric tokens—values and measurements (e.g., quantities, statistics, or other numbers) in the text.
    o4-mini
    2 ounces (900g/4 large packages
    GEMMA-3-12B-IT
    24-GEMMASCOPE-2-RES-16K
    INDEX 354
    adjectival or participial terms that describe qualities or states (often abstract or evaluative).
    gpt-5
    Lakhs (approx. $1,500
    GEMMA-3-12B-IT
    24-GEMMASCOPE-2-RES-16K
    INDEX 7896
    The neuron detects strongly evaluative or emphatic words (intensifying adjectives/adverbs and sentiment-laden descriptors).
    gpt-5-mini
    Lakhs (approx. $1,500
    GEMMA-3-12B-IT
    24-GEMMASCOPE-2-RES-16K
    INDEX 7896
    This neuron activates specifically on floating-point/decimal number tokens.
    o4-mini
    Lakhs (approx. $1,500
    GEMMA-3-12B-IT
    24-GEMMASCOPE-2-RES-16K
    INDEX 7896
    The neuron flags requests for incestuous or otherwise disallowed sexual content.
    o4-mini
    Precise, but could fit in some contexts)**↵↵*
    GEMMA-3-12B-IT
    24-GEMMASCOPE-2-RES-16K
    INDEX 572
    long, detailed explanatory or instructional passages (extended assistant-style responses).
    gpt-5-mini
    **Example:**Let's say we want to
    GEMMA-3-12B-IT
    24-GEMMASCOPE-2-RES-16K
    INDEX 169
    The neuron is triggered by explanatory or instructional passages—phrases that introduce or break down concepts in a tutorial-style or detailed, step-by-step explanation.
    o4-mini
    **Example:**Let's say we want to
    GEMMA-3-12B-IT
    24-GEMMASCOPE-2-RES-16K
    INDEX 169
    It detects large floating-point numeric tokens (decimal numbers like 1600–1680 with several digits).
    gpt-5-mini
    , here are two answers to "What is the best
    GEMMA-3-12B-IT
    24-GEMMASCOPE-2-RES-16K
    INDEX 200