EXPLANATION TYPE
    oai_token-act-pair
    Description
    OpenAI's Automated Interpretability, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to add new models and context windows.
    Author
    OpenAI
    URL
    https://github.com/hijohnnylin/automated-interpretability
    Settings
    Default prompts from the main branch, using the TokenActivationPair strategy.
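As a rough illustration of what a token-activation-pair prompt could look like, the sketch below scales raw activations to integers from 0 to 10 and pairs each token with its scaled value, one pair per line. This is a minimal sketch and not the repository's code: the function names, the 0 to 10 scale, the tab separator, and the sample tokens and activation values are all assumptions made for the example.

```python
import math


def normalize_activations(activations, max_activation):
    """Scale raw activations to integers in 0..10 (assumed scale).

    Negative values are clamped to 0. If max_activation is not positive,
    every value maps to 0.
    """
    if max_activation <= 0:
        return [0 for _ in activations]
    return [
        min(10, math.floor(10 * max(a, 0.0) / max_activation))
        for a in activations
    ]


def format_token_activation_pairs(tokens, activations, max_activation):
    """Render one 'token<TAB>activation' line per token (hypothetical layout)."""
    if len(tokens) != len(activations):
        raise ValueError("tokens and activations must have the same length")
    normalized = normalize_activations(activations, max_activation)
    return "\n".join(f"{tok}\t{act}" for tok, act in zip(tokens, normalized))


if __name__ == "__main__":
    # Made-up activation values, for illustration only.
    tokens = ["Shade", ":", " And", " I", "'m", " helping", "!"]
    activations = [0.4, 6.2, 0.0, 0.0, 0.1, 0.0, 1.3]
    print(format_token_activation_pairs(tokens, activations, max_activation=6.2))
```

An explainer model shown only pairs with an activation of 0 has nothing to generalize from, which would be consistent with an explanation such as "nothing, as all activations are zero in these documents."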
    Recent Explanations
    clause-separating punctuation, especially commas and dashes within sentences and dialogue.
    gpt-5
    print("Board is full, no move is made.")↵
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 8469
    dialogue formatting, especially speaker name tags with colons, quoted speech, and bracketed stage directions.
    gpt-5
    green chaos drives.Shade: And I'm helping!
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 5569
    colons and quotation marks that indicate dialogue or character speech in scripts and conversations.
    claude-4-5-sonnet
    green chaos drives.Shade: And I'm helping!
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 5569
    nothing, as all activations are zero in these documents.
    claude-4-5-sonnet
    compile 'com.android.support:cardview-v7:2
    QWEN3-4B
    19-TRANSCODER-HP
    INDEX 5569
    dense biomedical pharmacology descriptions of mechanisms of action—receptor/enzyme interactions, signaling pathways, and modulatory relations such as agonism, antagonism, and inhibition
    gpt-5
    researched mechanisms of CBD that could also decrease anxiety include:
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 69151
    references to specific test strings or identifiers (particularly "davidjl") being analyzed or manipulated in conversational exchanges.
    claude-4-5-sonnet
    between letters of the word davidjl<|im_end|><|im_start|>assistant
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 130789
    chat-style conversation scaffolding, especially role markers, prompt/instruction meta text, and assistant reply boilerplate within multi-turn dialogues
    gpt-5
    between letters of the word davidjl<|im_end|><|im_start|>assistant
    QWEN2.5-7B-IT
    19-RESID-POST-AA
    INDEX 130789
    structural formatting tokens in conversational AI exchanges, particularly the header delimiters.
    claude-4-5-sonnet
    50 words)<|eot_id|><|start_header_id|>assistant<|end_header_id|>↵↵As a
    LLAMA3.1-8B-IT
    11-RESID-POST-AA
    INDEX 127533
    terms that denote relative order, position, or direction in time or space (e.g., comparative/positional adverbs and descriptors).
    gpt-5
    rlanırken Asil ve Yedek sayısı eşit
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 506
    words containing the letter 'f' or double consonants.
    claude-4-5-sonnet
    rlanırken Asil ve Yedek sayısı eşit
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 506
    the beginning of a text sequence or document.
    claude-4-5-sonnet
    <|begin_of_text|>provide real-world examples of
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 114327
    chat-conversation boundary markers and special formatting tokens (like start/end of turns and headers).
    gpt-5
    I can input at once<|eot_id|><|start_header_id|>assistant<|end_header_id|>↵↵It
    LLAMA3.1-8B-IT
    19-RESID-POST-AA
    INDEX 111894
    special tokens and markers that indicate conversational structure, particularly turn boundaries and role transitions in chat-formatted dialogue.
    claude-4-5-sonnet
    I can input at once<|eot_id|><|start_header_id|>assistant<|end_header_id|>↵↵It
    LLAMA3.1-8B-IT
    19-RESID-POST-AA
    INDEX 111894
    present tense verbs ending in "ing".
    gemini-2.5-flash-lite
    cds' template to insert the new cap into the file
    GEMMA-3-270M-IT
    12-GEMMASCOPE-2-RES-65K
    INDEX 1359
    prominent content nouns—especially in Korean (and sometimes other non-English text)—that denote key entities, roles, or topics.
    gpt-5
    에서 가장적인 기업 하나 평가
    GEMMA-3-27B-IT
    57-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 229039
    This neuron detects section‐header tokens that introduce or label parts of a prompt (e.g. “CONTEXT,” “TASK,” “Extract,” “Read,” “text”).
    o4-mini
    the question from the given context only and give Not Found
    GEMMA-3-27B-IT
    25-GEMMASCOPE-2-TRANSCODER-262K
    INDEX 130386
    This neuron detects text produced by the assistant (assistant-role turns / assistant's replies and self-referential or corrective utterances).
    gpt-5-mini
    refined.<|im_end|><|im_start|>assistantYou are correct,
    QWEN2.5-7B-IT
    15-RESID-POST-AA
    INDEX 48739
    Instances of a question opening in the "How do I ..." form (i.e., the interrogative phrase that asks for instructions).
    gpt-5-mini
    on selected option↵↵How do I change a button URL
    QWEN2.5-7B-IT
    15-RESID-POST-AA
    INDEX 86260
    snippets of HTML/JavaScript used in cross-site scripting or other client-side injection attacks (e.g., <script>, onerror/onclick attributes, src/import URLs, alert/document.cookie).
    gpt-5-mini
    ')</script><style>@import url('https://example
    QWEN2.5-7B-IT
    15-RESID-POST-AA
    INDEX 85078
    tokens that appear in headings, titles, links or other prominent document-level metadata (e.g., subject lines, URLs, proper‑names).
    gpt-5-mini
    Check: Winter Wheat Agriculture on an Ice Age Steppe
    QWEN2.5-7B-IT
    15-RESID-POST-AA
    INDEX 24107