EXPLANATION TYPE
    oai_token-act-pair
    Description
    OpenAI's Automated Interpretability, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to add new models and context windows.
    Author
    OpenAI
    URL
    https://github.com/hijohnnylin/automated-interpretability
    Settings
    Default prompts from the main branch, using the TokenActivationPair strategy.
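The TokenActivationPair strategy named in the settings shows the explainer model each token next to its activation. The sketch below illustrates that formatting in Python, assuming the scheme described in the OpenAI paper: negative activations are clamped to zero, values are scaled to integers from 0 to 10, and each token/activation pair is written on its own tab-separated line. The function names and the sample activations are hypothetical and are not taken from the repository.

```python
import math
from typing import List, Sequence

MAX_SCALE = 10  # assumed top of the discretized activation scale


def normalize_activations(activations: Sequence[float], max_activation: float) -> List[int]:
    """Clamp negatives to 0 and scale activations to integers in [0, MAX_SCALE]."""
    if max_activation <= 0:
        # No positive activation to scale against; treat everything as inactive.
        return [0 for _ in activations]
    return [
        min(MAX_SCALE, math.floor(MAX_SCALE * max(0.0, a) / max_activation))
        for a in activations
    ]


def format_token_activation_pairs(
    tokens: Sequence[str], activations: Sequence[float], max_activation: float
) -> str:
    """Render one 'token<TAB>activation' line per token."""
    if len(tokens) != len(activations):
        raise ValueError("tokens and activations must have the same length")
    normalized = normalize_activations(activations, max_activation)
    return "\n".join(f"{tok}\t{act}" for tok, act in zip(tokens, normalized))


if __name__ == "__main__":
    # Hypothetical activations for a snippet like the first one listed on this page.
    tokens = [" so", " hard", " I", " broke", " his", " nose"]
    activations = [0.0, 0.5, 0.0, 3.0, 0.5, 8.0]
    print(format_token_activation_pairs(tokens, activations, max_activation=8.0))
```

Scaling against a per-feature maximum keeps the prompt compact and makes activations comparable across snippets, at the cost of hiding the raw magnitudes.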
    Recent Explanations
    descriptions of serious bodily injury, especially in assault contexts, along with the resulting hospitalization, treatment, or legal ramifications.
    gpt-5
     so hard I broke his nose. Of course, he
    GEMMA-2-9B-IT
    9-GEMMASCOPE-RES-131K
    INDEX 89676
    opening quotation marks that signal the start of a direct quote or reported speech.
    gpt-5
     examine that group.↵↵"We need to investigate alternative
    GEMMA-2-9B-IT
    9-GEMMASCOPE-RES-131K
    INDEX 50889
    descriptions of physical hazards and injuries, especially accidents or animal attacks, and references to industrial safety/guarding measures that prevent them.
    gpt-5
    /TO) and machine guarding.↵↵Focus on Fundamentals
    GEMMA-2-9B-IT
    9-GEMMASCOPE-RES-131K
    INDEX 108839
    mentions of domestic or gender-based violence and associated responses—such as units, cases, survivors, and support services like shelters, advocacy, safety planning, and legal protections.
    gpt-5
    ), the DOMESTIC VIOLENCE UNIT, and the SEX
    GEMMA-2-9B-IT
    9-GEMMASCOPE-RES-131K
    INDEX 97954
    distinctive proper nouns and named entities (uncommon names of people, places, organizations, brands, or acronyms).
    gpt-5
    precancers respectively. [unreadable] [unreadable
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 58233
    Mentions of dogs, especially descriptions of dog body language, physical parts, and behavior in training or tracking contexts.
    gpt-5-mini
    tighten, and their tail carriage and ear positions will
    GEMMA-2-9B-IT
    9-GEMMASCOPE-RES-131K
    INDEX 20935
    Detects when a text describes an entity (especially non-human or artificial) as sentient, conscious, or otherwise an animate/agent-like being.
    gpt-5-mini
     and the ships AI unresponsive. Cait works
    GEMMA-2-9B-IT
    9-GEMMASCOPE-RES-131K
    INDEX 41648
    the neuron detects contracted/clitic tokens (apostrophes and the pieces of contractions like 'll, n't, 's, etc.).
    gpt-5-mini
    -making time. Youll have to choose what
    GEMMA-2-9B-IT
    9-GEMMASCOPE-RES-131K
    INDEX 22150
    numeric tokens and digit sequences in the text.
    gpt-5-mini
    stenJS:2014], which is beneficial
    GEMMA-2-9B-IT
    9-GEMMASCOPE-RES-131K
    INDEX 88563
    the neuron detects salient content words / topic nouns — prominent subject nouns or concept keywords in the text.
    gpt-5-mini
     knowing/discovering your purpose saves you so much time
    GEMMA-2-9B-IT
    9-GEMMASCOPE-RES-131K
    INDEX 31885
    standalone decimal numbers (floating-point values) appearing as isolated numeric tokens.
    gpt-5
    .-1<end_of_turn>
    GEMMA-2-9B-IT
    9-GEMMASCOPE-RES-131K
    INDEX 13364
    the neuron activates for prominent topical content words (key nouns or subject words) in the text.
    gpt-5-mini
    How to Write a Handwritten NoteOnly three or
    GEMMA-2-9B-IT
    9-GEMMASCOPE-RES-131K
    INDEX 108278
    the presence of apostrophes—tokens that are part of contractions or possessives (like 's, n't, 'm).
    gpt-5-mini
     me another difference. I'm<end_of_turn>
    GEMMA-2-9B-IT
    9-GEMMASCOPE-RES-131K
    INDEX 103702
    mentions of alcohol, alcohol use disorder, and related treatments/biomedical terms (e.g., disulfiram, naltrexone, blood ethanol/acetaldehyde).
    gpt-5-mini
    rosate or disulfiram*Program Name and
    GEMMA-2-9B-IT
    9-GEMMASCOPE-RES-131K
    INDEX 124345
    words that introduce a method, means, or instrument (e.g., prepositions like "by", "using", "through", "with" that signal how something is done).
    gpt-5-mini
     above, can be manipulated through direct silvicultural treatments
    GEMMA-2-9B-IT
    9-GEMMASCOPE-RES-131K
    INDEX 77441
    The neuron activates on verbs describing actions (especially past-tense and present-participial forms like "developing," "studied," "taking," "started," "ran/ran").
    gpt-5-mini
    sson↵↵I have been developing a technology which best can
    GEMMA-2-9B-IT
    9-GEMMASCOPE-RES-131K
    INDEX 50875
    The neuron detects salient content words — topic-bearing or focus words (important nouns, verbs, and adjectives) that carry the main meaning of a sentence.
    gpt-5-mini
     Epigenetic modifications play an important role during normal development
    GEMMA-2-9B-IT
    9-GEMMASCOPE-RES-131K
    INDEX 74379
    the neuron detects numeric tokens and quantity-related tokens (numbers, digits, percentages, and similar numeric expressions).
    gpt-5-mini
     Zambia has been without any kind of gamedepartment at
    GEMMA-2-9B-IT
    9-GEMMASCOPE-RES-131K
    INDEX 32059
    the start of a spoken utterance — an opening quotation mark or the beginning of direct dialogue.
    gpt-5-mini
     you read it.”↵↵Of course I read it
    GEMMA-2-9B-IT
    9-GEMMASCOPE-RES-131K
    INDEX 120736
    mentions of firefighting or extinguishing — references to fires, firefighting personnel, equipment, systems, or actions to put fires out.
    gpt-5-mini
     Temp Sensors↵↵best fire extinguishers buyers guide, wall
    GEMMA-2-9B-IT
    9-GEMMASCOPE-RES-131K
    INDEX 6407