EXPLANATION TYPE
    oai_token-act-pair
    Description
    OpenAI's Automated Interpretability, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to add new models and context windows.
    Author
    OpenAI
    URL
    https://github.com/hijohnnylin/automated-interpretability
    Settings
    Default prompts from the main branch, strategy TokenActivationPair.
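As a rough illustration of what a token-activation-pair strategy involves, the sketch below rescales raw activations to integers from 0 to 10 and prints each token next to its scaled value, one tab-separated pair per line. This is a minimal sketch under assumptions, not the repository's actual code: the function names, the floor-based rounding, and the sample activation values are all hypothetical and chosen only for illustration.

```python
import math


def normalize_activations(activations, max_activation):
    """Map raw activations onto integers 0..10 (hypothetical helper).

    Negative values are clipped to 0 before scaling. The floor-based
    rounding is an assumption made for this sketch.
    """
    if max_activation <= 0:
        return [0 for _ in activations]
    return [
        min(10, math.floor(10 * max(a, 0.0) / max_activation))
        for a in activations
    ]


def format_token_activation_pairs(tokens, activations, max_activation):
    """Render one "token<TAB>activation" line per token (hypothetical helper)."""
    if len(tokens) != len(activations):
        raise ValueError("tokens and activations must have the same length")
    scaled = normalize_activations(activations, max_activation)
    return "\n".join(f"{tok}\t{act}" for tok, act in zip(tokens, scaled))


# Illustrative tokens with made-up activation values (not real feature data).
tokens = ["County", " Sheriff", " dispatchers", " received"]
activations = [0.0, 0.5, 4.0, -1.0]

prompt_fragment = format_token_activation_pairs(tokens, activations, max_activation=4.0)
print(prompt_fragment)
```

An explainer model would then be shown text like this and asked to summarize which tokens the feature responds to, which is roughly how the short descriptions listed under Recent Explanations come about.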
    Recent Explanations
    substantive model responses and explanations, particularly longer passages with detailed technical or instructional content.
    claude-4-5-haiku
    speeds and directions with height in the clouds"<end_of_turn>
    GEMMA-3-4B-IT
    29-GEMMASCOPE-2-RES-65K
    INDEX 1832
    tokens related to structured data formatting, field separators, and punctuation that denotes hierarchical organization in complex documents.
    claude-4-5-haiku
    except for the last one:{{"Python
    GEMMA-3-4B-IT
    29-GEMMASCOPE-2-RES-65K
    INDEX 3999
    detailed step-by-step instructions and comprehensive explanations.
    claude-4-5-haiku
    on Hugging Face Hub.It tells `fast
    GEMMA-3-4B-IT
    29-GEMMASCOPE-2-RES-65K
    INDEX 1263
    tokens that are numeric values (especially floating-point or measurement-style numbers).
    gpt-5-mini
    image = torch.randn(3, self.image
    GEMMA-3-4B-IT
    29-GEMMASCOPE-2-RES-65K
    INDEX 5881
    instructions specifying JSON output formatting (especially the "Output everything in the following JSON object" phrase and related result-variable/field-format rules).
    gpt-5-mini
    except for the last one:{{"Python
    GEMMA-3-4B-IT
    29-GEMMASCOPE-2-RES-65K
    INDEX 3999
    words related to warnings, disclaimers, and formal instructional language (such as "do not," "errors," "yet," and "All").
    claude-4-5-haiku
    p>Fun fact: this week Time Out is the
    GEMMA-2-2B
    20-GEMMASCOPE-RES-16K
    INDEX 15729
    the word "Force" when it appears at the beginning of a sentence or as part of a proper noun or technical term.
    claude-4-5-sonnet
    es has returned application/force-download as the content
    GEMMA-3-12B
    41-GEMMASCOPE-2-RES-262K
    INDEX 240835
    narrative text indicating first-person perspective or character actions, particularly in role-play, dialogue, or story contexts.
    claude-4-5-sonnet
    re right,” I said, a small, genuine smile
    GEMMA-3-4B-IT
    25-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 8576
    Based on the activation patterns across all the text samples, this neuron activates strongly on **first-person narrative perspective and introspective emotional states**, particularly when characters are processing complex feelings, memories, or moments of vulnerability. The neuron shows high activations on pronouns like "I
    claude-4-5-haiku
    re right,” I said, a small, genuine smile
    GEMMA-3-4B-IT
    25-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 8576
    phrases that explicitly frame something in historical context or refer to long-term origins and continuity
    gpt-5
    a variety of reasons historically lower incomes, higher unemployment
    GPT-OSS-20B
    7-RESID-POST-AA
    INDEX 2007
    dates and year numbers within historical or academic texts.
    claude-4-5-haiku
    Routes, 17351815. Bount
    GPT-OSS-20B
    3-RESID-POST-AA
    INDEX 2001
    words related to emergency dispatchers and 911 dispatch operations.
    claude-4-5-haiku
    County Sheriff’s Office, dispatchers received a call at
    GPT-OSS-20B
    3-RESID-POST-AA
    INDEX 12219
    Chinese proximal demonstratives and simple numeral markers, especially when introducing noun phrases or section/list headings.
    gpt-5
    1] = 进入
    GPT-OSS-20B
    3-RESID-POST-AA
    INDEX 3001
    Lines marked as additions in a diff/patch (the leading "+" that indicates an added line).
    gpt-5-mini
    server = NULL;+ buf_free (
    GEMMA-2-9B-IT
    9-GEMMASCOPE-RES-131K
    INDEX 20691
    Mentions of the female reproductive cycle and hormone-related reproductive conditions.
    gpt-5-mini
    , or phase of menstrual cycle) and consequently it is
    GEMMA-2-9B-IT
    9-GEMMASCOPE-RES-131K
    INDEX 7327
    tokens that appear in structured code/config/metadata lines — i.e., labels and colon-separated key/value markers (like fileID, Script, Editor, Prefab, and the colon).
    gpt-5-mini
      m_EditorHideFlags:0  m
    GEMMA-2-9B-IT
    9-GEMMASCOPE-RES-131K
    INDEX 113225
    short uppercase alphabetic tokens — acronyms or initials (e.g., two‑letter/abbreviated scientific or name initials).
    gpt-5-mini
    _JUNIPER_MLFR                  =0
    GEMMA-2-9B-IT
    9-GEMMASCOPE-RES-131K
    INDEX 82116
    The neuron detects markers that indicate an answer or reply/closing in a post (e.g., "A:", "Thanks", and similar reply/closing tokens).
    gpt-5-mini
     class?↵↵Thanks.↵↵A:↵↵You need
    GEMMA-2-9B-IT
    9-GEMMASCOPE-RES-131K
    INDEX 70614
    Finds first- and second-person expressions of agency (I/you), especially words indicating requests, intent, ability or actions.
    gpt-5-mini
     manufacturers, explain that you choose only the best components of
    GEMMA-2-9B-IT
    9-GEMMASCOPE-RES-131K
    INDEX 92322
    phrases and tokens related to health, safety, caregiving, and practical advice (medical/medical-adjacent situations).
    gpt-5-mini
     out of your control when driving. These can include such
    GEMMA-2-9B-IT
    9-GEMMASCOPE-RES-131K
    INDEX 88954