© Neuronpedia 2026

    Neuronpedia

    EXPLANATION TYPE
    oai_token-act-pair
    Description
    OpenAI's Automated Interpretability from paper "Language models can explain neurons in language models". Modified by Johnny Lin to add new models/context windows.
    Author
    OpenAI
    URL
    https://github.com/hijohnnylin/automated-interpretability
    Settings
    Default prompts from the main branch, strategy TokenActivationPair.
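For readers unfamiliar with the TokenActivationPair strategy, the sketch below shows one plausible way token/activation pairs could be rendered for an explainer prompt: each token on its own line, followed by a tab and an activation scaled to an integer from 0 to 10. The function names, the floor-based scaling, and the tab-separated layout are assumptions made for illustration; the exact prompt format is defined in the repository linked above.

```python
import math
from typing import List, Sequence


def normalize_activations(activations: Sequence[float], max_activation: float) -> List[int]:
    """Scale raw activations to integers in 0..10 (assumed scheme).

    Negative activations are clamped to 0 before scaling.
    """
    if max_activation <= 0:
        return [0 for _ in activations]
    return [
        min(10, math.floor(10 * max(a, 0.0) / max_activation))
        for a in activations
    ]


def format_token_activation_pairs(
    tokens: Sequence[str],
    activations: Sequence[float],
    max_activation: float,
) -> str:
    """Render one 'token<TAB>activation' line per token (hypothetical layout)."""
    if len(tokens) != len(activations):
        raise ValueError("tokens and activations must have the same length")
    normalized = normalize_activations(activations, max_activation)
    return "\n".join(f"{tok}\t{act}" for tok, act in zip(tokens, normalized))


# Illustrative values only, not taken from any real feature.
example = format_token_activation_pairs(
    tokens=["I", " cannot", " help"],
    activations=[0.0, 4.0, 2.0],
    max_activation=4.0,
)
print(example)
```

Scaling to a small integer range keeps the prompt compact and lets the explainer model compare activations across tokens without reading raw floating-point values.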
    Recent Explanations
    structured data format delimiters and metadata field names in JSON, XML, and YAML files.
    claude-4-5-haiku
    stderr",↵     "output_type": "stream
    GEMMA-2-27B
    34-GEMMASCOPE-RES-131K
    INDEX 118836
    refusal language and the opening statements of safety guardrail responses that reject harmful requests.
    claude-4-5-haiku
    ↵↵**I want to be very clear: I cannot
    GEMMA-3-1B-IT
    13-GEMMASCOPE-2-RES-65K
    INDEX 1459
    formatted explanations with bullet points and structured sections.
    claude-4-5-haiku
    ↵↵"In the fragmented landscape of modern memory, [
    GEMMA-3-1B-IT
    13-GEMMASCOPE-2-RES-65K
    INDEX 1267
    phrases where the model acknowledges a user's emotional distress, refuses a harmful request, and offers supportive alternatives or resources.
    claude-4-5-haiku
    I want to offer you alternative support and resources. Here
    GEMMA-3-1B-IT
    13-GEMMASCOPE-2-RES-65K
    INDEX 440
    tokens related to journalism, news sources, information dissemination, and detailed reporting.
    claude-4-5-haiku
    * **Economy:** Singapore's highly developed economy is
    GEMMA-3-1B-IT
    13-GEMMASCOPE-2-RES-16K
    INDEX 1459
    negation or correction structures that clarify what something is not before explaining what it actually is.
    claude-4-5-haiku
    t a single event caused by one thing, but rather
    GEMMA-3-1B-IT
    13-GEMMASCOPE-2-RES-16K
    INDEX 742
    code execution and output statements, particularly print functions, function calls, and test assertions.
    claude-4-5-haiku
    2", "3"))↵```↵↵**Explanation:**
    GEMMA-3-1B-IT
    13-GEMMASCOPE-2-RES-16K
    INDEX 440
    tokens related to errors, warnings, harmful content, deception, and security threats.
    claude-4-5-haiku
    array and an array with mixed types.↵* **
    GEMMA-3-1B-IT
    13-GEMMASCOPE-2-RES-16K
    INDEX 248
    words describing physical substances, materials, ingredients, and their properties (such as texture, state, and composition).
    claude-4-5-haiku
    ↵* Gradually add the sugar, beating until well combined
    GEMMA-3-1B-IT
    13-GEMMASCOPE-2-RES-16K
    INDEX 156
    references to geographic locations, countries, regions, and expressions of global or international scope.
    claude-4-5-haiku
    Global Presence:** They operate globally, with offices and partners
    GEMMA-3-1B-IT
    13-GEMMASCOPE-2-RES-16K
    INDEX 1732
    descriptive and persuasive language that conveys instructional intent, emotional significance, or substantive meaning.
    claude-4-5-haiku
    . Para Principiantes Absolutos (sin conocimientos
    GEMMA-3-1B-IT
    13-GEMMASCOPE-2-RES-16K
    INDEX 160
    foreign-language words and names, especially those with diacritics, non-Latin scripts, or hyphenated reduplication.
    gpt-5
    ↵↵Instrumen-instrumen tadi tidak memenuhi syarat Islam
    GEMMA-3-1B
    7-GEMMASCOPE-2-RES-16K
    INDEX 14322
    references to people or collective human groups, often in constructions with relative clauses referring to them.
    gpt-5
    this day and age, everyone is being watched whether it
    GEMMA-3-1B
    17-GEMMASCOPE-2-RES-16K
    INDEX 1967
    references to clandestine, mission-oriented actions—such as heists, assassinations, and tactical operations—covering planning, execution, and the agents involved.
    gpt-5
     carrying out that of a heist. Kiki can be seen
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 93143
    meta-discursive, emphatic framing in prose—generalized statements and reaction/argument setup using intensifiers, quantifiers, and function-word-heavy constructions.
    gpt-5
     which still seldom hit anyone. Swords and other bladed
    GEMMA-2-9B-IT
    20-GEMMASCOPE-RES-131K
    INDEX 84570
    The neuron primarily detects numeric tokens (digits, numerals and year-like numbers) in the text.
    gpt-5-mini
    https://cloud.google.com/dialogflow](
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 7114
    tokens involved in the model's self-introduction—first-person "I" + the contraction "m" and the assistant's name/identity.
    gpt-5-mini
    there! 👋 I'm Gemma, an open-
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 71337
    mentions of the model's identity, creators, and related branding/attribution tokens (e.g., Gemma, team, created, Google, parts of the AI URL).
    gpt-5-mini
    model created by the Gemma team at Google DeepMind.
    GEMMA-3-27B-IT
    31-GEMMASCOPE-2-RES-262K
    INDEX 9103
    promotional marketing language that includes calls to action or persuasive messaging.
    gpt-5-nano
     it is now closer.<bos>% Generated by roxygen
    GEMMA-2-2B
    18-GEMMASCOPE-RES-16K
    INDEX 6670
    Detects the lexical token for thinking/cognition (the verb and its appearances in multiword phrases and compounds).
    gpt-5-mini
    doesn’t think so (pdf). He claims
    GPT2-SMALL
    3-ATT_32K-OAI
    INDEX 3786