
    Neuronpedia

    © Neuronpedia 2025
    EXPLANATION TYPE
    oai_token-act-pair
    Description
    OpenAI's Automated Interpretability, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to add new models and context windows.
    Author
    OpenAI
    URL
    https://github.com/hijohnnylin/automated-interpretability
    Settings
    Default prompts from the main branch, strategy TokenActivationPair.
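    As a rough illustration of what a token-activation-pair strategy feeds to the explainer model, the sketch below renders each token on its own line next to a discretized activation. It assumes a 0 to 10 integer scale and a tab separator; the function names and the sample values are hypothetical and are not taken from the repository linked above.

```python
from typing import List, Sequence

MAX_LEVEL = 10  # assumed top of the discretized activation scale


def normalize_activations(activations: Sequence[float], max_activation: float) -> List[int]:
    """Map raw activations to integers in 0..MAX_LEVEL; negative values clamp to 0."""
    if max_activation <= 0:
        return [0 for _ in activations]
    return [
        min(MAX_LEVEL, int(MAX_LEVEL * max(a, 0.0) / max_activation))
        for a in activations
    ]


def format_token_activation_pairs(
    tokens: Sequence[str],
    activations: Sequence[float],
    max_activation: float,
) -> str:
    """Render one 'token<TAB>level' line per token (hypothetical helper)."""
    if len(tokens) != len(activations):
        raise ValueError("tokens and activations must have the same length")
    levels = normalize_activations(activations, max_activation)
    return "\n".join(f"{tok}\t{lvl}" for tok, lvl in zip(tokens, levels))


# Made-up activations for three tokens, scaled against a made-up maximum of 4.0.
example = format_token_activation_pairs(
    ["private", " Layout", "Inflater"], [4.0, 0.0, 2.0], max_activation=4.0
)
print(example)
```

    Scaling against a single per-feature maximum keeps the levels comparable across different text snippets for the same feature, which is the main reason to pass max_activation in rather than compute it per snippet.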
    Recent Explanations
    Java code declarations that use access modifiers (e.g., method, constructor, or field signatures in declaration lines).
    gpt-5
    private LayoutInflater inflater;↵↵ public MyObjectArrayAdapter(Context
    LLAMA3.1-8B-IT
    11-RESID-POST-AA
    INDEX 104186
    negation words or discussions of negative aspects, absences, and missing information.
    claude-4-5-haiku
    'info', 'code'.<|eot_id|><|start_header_id|>assistant<|end_header_id|>↵↵
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 111081
    punctuation marks and special characters used for document formatting and structural separation.
    claude-4-5-haiku
    t '. ; ; '. t - ; a
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 80948
    Long hexadecimal-like GUIDs or long alphanumeric asset identifiers (serialization GUIDs) in file headers.
    gpt-5-mini
    guid: f871f5933d984534fb
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 123424
    tokens representing numbers or numeric values.
    gpt-5-mini
    can specify a negative number for the rotation.↵↵Here is
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 54108
    sentences where the model talks about itself — explaining its capabilities, limits, memory/context/window/token constraints, or other self‑referential assistant disclaimers.
    gpt-5-mini
    like a really helpful, but forgetful, assistant.
    GEMMA-3-27B-IT
    40-GEMMASCOPE-2-RES-262K
    INDEX 26035
    words related to resource management and capacity.
    gemini-2.5-flash-lite
     "We've had to increase employees. . .
    GEMMA-2-2B
    16-GEMMASCOPE-TRANSCODER-16K
    INDEX 1024
    constructions expressing the act of examining or directing attention to something, especially the “look … at” pattern across tenses and phrasings.
    gpt-5
     have to look at the↵manner in which
    GEMMA-2-9B
    21-GEMMASCOPE-RES-16K
    INDEX 1527
    mentions of annual or yearly rates, metrics, or recurring events and meetings.
    gpt-5
    ="ref"}↵↵Our annualized total mortality rate for
    GEMMA-2-9B
    21-GEMMASCOPE-RES-16K
    INDEX 7455
    mentions of freshness, typically describing food or newly obtained/just-made items.
    gpt-5
    ↵↵Anyone up for some fresh seafood in Kuala Lumpur?
    GEMMA-2-9B
    21-GEMMASCOPE-RES-16K
    INDEX 6350
    adjectival references to nationalities, languages, or countries (demonyms/ethnonyms).
    gpt-5
     with an outdoor picnic, Japanese music, and haiku poetry
    GEMMA-2-9B
    21-GEMMASCOPE-RES-16K
    INDEX 12541
    copyright notices and publication metadata in web content.
    deepseek-r1
    , August 27th, 201
    GEMMA-2-9B
    21-GEMMASCOPE-RES-16K
    INDEX 6550
    graphic, sensory depictions of liquids being expelled or spilled—especially messy bodily fluids and leakage scenes.
    gpt-5
    ush of fluids came spewing onto the floor. It
    GEMMA-2-9B
    21-GEMMASCOPE-RES-16K
    INDEX 642
    boilerplate metadata and headers, including post timestamps and filing info, RSS/feed notices, comment/ping status, and copyright/license notices.
    gpt-5
    , August 27th, 201
    GEMMA-2-9B
    21-GEMMASCOPE-RES-16K
    INDEX 6550
    instructions in technical or math text that direct the reader to compute or compare values (e.g., evaluate, calculate, determine which is greater, find a greatest common divisor)
    gpt-5
    [1,red, my label={A}↵
    GEMMA-2-9B
    21-GEMMASCOPE-RES-16K
    INDEX 2129
    phrases indicating an action or state spanning an extent, especially over a time period, quantity, or repeated occurrence.
    gpt-5
    ↵collaborate on this over the weekend? Should be
    GEMMA-2-9B
    21-GEMMASCOPE-RES-16K
    INDEX 11368
    intensified evaluative descriptors—adjectives/adverbs conveying strong degree or notable qualities like complexity, uniqueness, severity, or chaos.
    gpt-5
    dots,m$. Very similar computations to those in [@
    GEMMA-2-9B
    21-GEMMASCOPE-RES-16K
    INDEX 7557
    special typographic spacing and formatting characters, especially nonstandard whitespace used in tables, lists, and metadata.
    gpt-5
    SGPALS 1--4, score
    GEMMA-2-9B
    21-GEMMASCOPE-RES-16K
    INDEX 12448
    instructions or suggestions to check or verify something (including references to checking docs or conditions)
    gpt-5
     well.↵you should check if you have specified the
    GEMMA-2-9B
    21-GEMMASCOPE-RES-16K
    INDEX 6150
    auxiliary and modal function words that signal passive voice, possibility/necessity, and connective structure in formal or technical prose.
    gpt-5
     purposes. A questionnaire was used to collect data from individuals
    GEMMA-2-9B
    21-GEMMASCOPE-RES-16K
    INDEX 7163