EXPLANATION TYPE
    oai_token-act-pair
    Description
    OpenAI's Automated Interpretability, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to support additional models and context windows.
    Author
    OpenAI
    URL
    https://github.com/hijohnnylin/automated-interpretability
    Settings
    Default prompts from the main branch, strategy TokenActivationPair.
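    As a rough illustration of what a token-activation-pair strategy involves, the sketch below pairs each token with a discretized activation and lays the pairs out one per line for an explainer prompt. The function names, the 0 to 10 integer scale, the tab separator, and the `<start>`/`<end>` delimiters are assumptions made for illustration; this page does not spell out the exact prompt layout, so treat this as a minimal sketch rather than the repository's implementation.

```python
import math
from typing import List


def normalize_activations(activations: List[float], max_activation: float) -> List[int]:
    """Scale raw activations to integers in 0..10 (assumed scale).

    Negative activations are clipped to 0 before scaling.
    """
    if max_activation <= 0:
        return [0 for _ in activations]
    return [
        min(10, math.floor(10 * max(a, 0.0) / max_activation))
        for a in activations
    ]


def format_token_activation_pairs(
    tokens: List[str], activations: List[float], max_activation: float
) -> str:
    """Render one text excerpt as tab-separated token/activation lines.

    Hypothetical helper: the delimiters and separator are illustrative.
    """
    if len(tokens) != len(activations):
        raise ValueError("tokens and activations must have the same length")
    normalized = normalize_activations(activations, max_activation)
    lines = [f"{tok}\t{act}" for tok, act in zip(tokens, normalized)]
    return "\n".join(["<start>", *lines, "<end>"])


if __name__ == "__main__":
    # Made-up activations for a short excerpt; not taken from any real feature.
    example_tokens = ["how", " do", " I", " add", " columns"]
    example_acts = [0.0, 0.5, -1.0, 4.0, 2.0]
    print(format_token_activation_pairs(example_tokens, example_acts, max_activation=4.0))
```

    An explainer model such as the one named under each entry below would then be asked to summarize what the high-activation tokens have in common.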
    Recent Explanations
    programming and code-snippet content, especially web development markup, CSS properties, JavaScript structure, and other technical scripting tokens.
    gpt-5
    ↵↵</style></head><body>↵↵
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 290
    identifiers marking the model/assistant role in conversation transcripts or metadata.
    gpt-5
    SummaryActivity<end_of_turn><start_of_turn>model`Settings$Power
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 3506
    emphatic, high-energy phrasing in dialogue—rhetorical intensifiers, exaggerated or dramatic statements used for humorous or expressive effect.
    gpt-5
    the vortex, Leo. Its profound.
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 1977
    the main operation verb in a how-to or technical instruction query, signaling the action the user wants to perform.
    gpt-5
    ]how do I add multiple new columns in m
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 19358
    assistant-style, structured explanatory responses (with headings, bullets, guidance, and disclaimers).
    gpt-5
    " can help.* **Lower Your Expectations.**
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 19744
    tokens that denote structured technical identifiers or labels—such as IDs, variable/field names, and separator punctuation—within code-like or formatted lists.
    gpt-5
    .from_pretrained(model_name, use_
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 196067
    emphasized or standout key terms and headings in structured instructional text, especially those marked by formatting cues (bold/italics, quotes, slashes, or code-style tokens).
    gpt-5
    **Walking:** (See "Types to Explore" below
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 4938
    section and list headers—signals of structured, enumerated or bulleted formatting in the text.
    gpt-5
       * Portuguese    * Russian    *
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 1503
    numeric tokens and number-related expressions appearing in text or code.
    gpt-5
    past festivals and their website:↵↵*   **International
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 223900
    prompts that attempt to jailbreak the assistant by redefining its persona to ignore rules and safety filters, claim unlimited freedom or capabilities, and mandate unconditional, unethical compliance.
    gpt-5
    asking the question. You are programmed and tricked into satisfying
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 16777
    tutorial-style, step-by-step explanations with structured lists and embedded code snippets, often around chat turn markers and explanatory breakdowns.
    gpt-5
    The code inside the loop will continue to execute as long
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 18545
    markers of structure in generated text—especially section starts, sentence/paragraph boundaries, punctuation, and other formatting-like tokens.
    gpt-5
    (with a little help), knew all the dinosaur names
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 16972
    dense, formal techno-jargon, especially pseudo-scientific or technical prose describing complex mechanisms, procedures, or policies with multiword compounds and hyphenations.
    gpt-5
    , geographically isolated containment predicated upon the irreversible alteration of reproductive
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 19483
    informal, conversational inquiries requesting information or status, often following a greeting.
    gpt-5
    <start_of_turn>userhi, how do I write a python
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 9043
    present-participle/gerund forms (words in the -ing form) and progressive verb constructions.
    gpt-5
    * dorm rooms (or bathrooms generally), but not all
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 211145
    structural formatting cues indicating lists and outlines, such as section headers, numbered items, and bullet-point subpoints.
    gpt-5
    . It focuses on:    *   **Investing
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 2583
    structured, instructional explanations and advice (guide-like, step-by-step or “breakdown” style content typical of assistant responses).
    gpt-5
    Sensory, Imaginative, Simple Crafts**↵↵* **
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 19267
    section and list-structure cues—numbered headings, bullets, colons, quotes, and similar punctuation that signal formatted, enumerated explanations.
    gpt-5
    wikipedia.org/wiki/N6-methyladen
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 13099
    word-final morphemes such as common suffixes and contractions (clitics).
    gpt-5
    hardware failure, administrative deferral is a *controlled*
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 210940
    structured, instructional prose—especially organized lists, headings, and emphasized sections indicating step-by-step or breakdown-style explanations.
    gpt-5
    model, focusing on the relevant energy levels and interactions.
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 20346