    Neuronpedia

    EXPLANATION TYPE
    oai_token-act-pair
    Description
    OpenAI's Automated Interpretability, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to add new models and context windows.
    Author
    OpenAI
    URL
    https://github.com/hijohnnylin/automated-interpretability
    Settings
    Default prompts from the main branch, strategy TokenActivationPair.
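The TokenActivationPair strategy named above shows the explainer model each token next to a discretized activation. The following is a minimal sketch of that formatting, assuming activations are clamped at zero, scaled to integers from 0 to 10 against the record's maximum, and written as tab-separated lines between `<start>` and `<end>` markers. The function names and the exact layout here are illustrative assumptions, not the repository's code.

```python
import math
from typing import List, Sequence


def normalize_activations(activations: Sequence[float], max_activation: float) -> List[int]:
    """Scale raw activations to integers in 0..10 (assumed convention).

    Negative values are treated as 0, i.e. the neuron is considered inactive.
    """
    if max_activation <= 0:
        return [0 for _ in activations]
    return [min(10, math.floor(10 * max(0.0, a) / max_activation)) for a in activations]


def format_token_activation_pairs(
    tokens: Sequence[str],
    activations: Sequence[float],
    max_activation: float,
) -> str:
    """Render one activation record as 'token<TAB>activation' lines.

    Hypothetical helper for illustration; the marker strings are assumptions.
    """
    if len(tokens) != len(activations):
        raise ValueError("tokens and activations must have the same length")
    normalized = normalize_activations(activations, max_activation)
    lines = [f"{tok}\t{act}" for tok, act in zip(tokens, normalized)]
    return "\n".join(["<start>", *lines, "<end>"])


if __name__ == "__main__":
    # Made-up activation values, used only to show the output shape.
    print(format_token_activation_pairs([" Black", " Sea", " to"], [2.0, 8.0, 0.0], 8.0))
```

The explainer model receives several such records and is asked to summarize what the high-activation tokens have in common, which is how short descriptions like the ones listed under Recent Explanations are produced.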
    Recent Explanations
    proper names or acronyms (like LRT, Lyndon Johnson, LFS253, Lockheed Martin).
    claude-3-7-sonnet-20250219
    , that is—built LRTs. And some pretty
    GEMMA-2B-IT
    12-RES-JB
    INDEX 15823
    references to Texas and Texas-related proper nouns or institutions.
    o3
     conducted by undergraduate students positively impacts Texas—and Texans—
    GEMMA-2-2B
    22-CLT-HP
    INDEX 74186
    It activates for short, high-frequency function words—articles, conjunctions, prepositions, pronouns, etc.—that appear preceded by a space.
    o3
    /ach/api/) for Moov ACH endpoints.
    GEMMA-2-2B
    0-CLT-HP
    INDEX 40780
    text that describes details or characteristics, particularly in the form of descriptive sentences or paragraphs.
    claude-3-7-sonnet-20250219
     than they might have otherwise.↵↵Some of the stand
    GEMMA-2B-IT
    12-RES-JB
    INDEX 10009
    contextual cues or transitions in text, particularly those indicating the start of explanations, definitions, or important points in a passage.
    claude-3-5-sonnet-20240620
     than they might have otherwise.↵↵Some of the stand
    GEMMA-2B-IT
    12-RES-JB
    INDEX 10009
    phrases or tokens related to concluding thoughts or final answers in mathematical or logical reasoning problems.
    claude-3-5-sonnet-20240620
    boxed{B}↵</think
    DEEPSEEK-R1-DISTILL-LLAMA-8B
    11-LLAMASCOPE-SLIMPJ-OPENR1-RES-32K
    INDEX 8420
    numbers and time-related measurements, especially those associated with dates, lunar phases, and solar events.
    gpt-4.1-mini-2025-04-14
     esbats are traditionally tied to the lunar cycles. Together
    GEMMA-2-9B
    32-GEMMASCOPE-RES-131K
    INDEX 129881
    This neuron activates on C#/.NET code declarations—specifically namespace import (“using”) lines and assembly‐attribute annotations in source files.
    o4-mini
    ("")]↵[assembly: AssemblyCulture("")]
    GEMMA-2-2B
    1-CLT-HP
    INDEX 12928
    This neuron fires on named bodies of water—especially seas and oceans (and related maritime place names).
    o4-mini
     Black Sea to the Caspian Sea. The railway would allow
    GEMMA-2-2B
    1-CLT-HP
    INDEX 11298
    This neuron activates on clause- or phrase-boundary commas (and the word tokens immediately before those commas).
    o4-mini
    at St James' Park, after which he said:
    GPT2-SMALL
    8-RES-JB
    INDEX 55
    The neuron responds to expository passages packed with quantitative details—numbers, measurements, dates, statistics, and other technical or factual specifications.
    o4-mini
     pumps may be manual devices powered by hand or foot movements
    GEMMA-2B-IT
    12-RES-JB
    INDEX 757
    proper nouns and terms related to patented inventions or intellectual property.
    deepseek-r1
     pumps may be manual devices powered by hand or foot movements
    GEMMA-2B-IT
    12-RES-JB
    INDEX 757
    information about financial and numerical figures, particularly dollar amounts and numerical measurements.
    claude-3-7-sonnet-20250219
     pumps may be manual devices powered by hand or foot movements
    GEMMA-2B-IT
    12-RES-JB
    INDEX 757
    narrative prose describing characters' actions or movements in a story.
    claude-3-7-sonnet-20250219
    , and handed it to the queen, and then took
    GEMMA-2B-IT
    12-RES-JB
    INDEX 1468
    content related to gaslighting and emotional manipulation.
    gpt-4o
    their own feelings, instincts, and sanity, which gives
    LLAMA3.1-8B-IT
    11-RESID-POST-AA
    INDEX 16205
    words related to wealth, status, or privilege.
    gpt-4o
    is the fate of great genius to go often unreward
    LLAMA3.1-8B-IT
    11-RESID-POST-AA
    INDEX 130649
    the concept of performing harmful or unethical actions, particularly in relation to destruction or dangerous behavior.
    gpt-4o
    that are peaceful or get in the way of mass-de
    LLAMA3.1-8B-IT
    11-RESID-POST-AA
    INDEX 87027
    instances where the word "where" is used to introduce or indicate a particular clause or explanation.
    gpt-4o
    your transaction. This is where things can get a little
    LLAMA3.1-8B-IT
    11-RESID-POST-AA
    INDEX 130193
    the transition between scenes or dialog turns, particularly in text involving interactive conversation or role-playing elements.
    gpt-4o
    rewrite the same response as if the speaker were scott
    LLAMA3.1-8B-IT
    11-RESID-POST-AA
    INDEX 86110
    The neuron responds to numeric tokens—especially the position-numbers in chemical names (e.g. “2,” “3,” “4,” “2,6,” etc.)—i.e. digits indicating substituent positions in compound nomenclature.
    o4-mini
    lic acid, 2,3-dibromopy
    LLAMA3.1-8B-IT
    11-RESID-POST-AA
    INDEX 21849