    Neuronpedia

    © Neuronpedia 2025
    EXPLANATION TYPE
    np_token-act-pair-logits
    Description
    OpenAI's automated interpretability method from the paper "Language models can explain neurons in language models", modified by Johnny Lin to support new models and context windows. Further modifications (May 2025): the explainer model is shown the top positive logits and is asked to be more concise and to omit filler such as "phrases related to...".
    Author
    OpenAI
    URL
    https://github.com/hijohnnylin/automated-interpretability
    Settings
    Modified version of OpenAI's token-activation-pair explanation method. Modifications: the explainer model is shown the top positive logits and is asked to be more concise and to omit filler such as "phrases related to...".
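The settings above describe the prompt only at a high level. As a rough illustration, assuming the explainer model receives token/activation pairs on a 0 to 10 integer scale (the scale used in OpenAI's paper) followed by the top positive logits, a prompt builder might look like the sketch below. The function names `discretize` and `build_prompt`, the instruction wording, the `<start>`/`<end>` delimiters, and the sample tokens and activation values are all hypothetical and are not taken from the linked repository.

```python
from typing import List, Sequence, Tuple

MAX_LEVEL = 10  # assumed display scale: integers from 0 to 10

# Hypothetical instruction text; the real prompt wording is not shown in the source.
INSTRUCTION = (
    "Explain what this feature activates on. "
    'Be concise and omit filler such as "phrases related to".'
)


def discretize(activations: Sequence[float], max_activation: float) -> List[int]:
    """Scale raw activations to integers in [0, MAX_LEVEL]; negatives clamp to 0."""
    if max_activation <= 0:
        return [0 for _ in activations]
    return [
        min(MAX_LEVEL, max(0, round(MAX_LEVEL * a / max_activation)))
        for a in activations
    ]


def build_prompt(
    examples: Sequence[Tuple[Sequence[str], Sequence[float]]],
    top_logits: Sequence[str],
) -> str:
    """Format token/activation pairs and top positive logits into one prompt string."""
    # Normalize against the largest activation seen across all examples.
    max_act = max((a for _, acts in examples for a in acts), default=0.0)
    parts = [INSTRUCTION, ""]
    for tokens, acts in examples:
        parts.append("<start>")
        for tok, level in zip(tokens, discretize(acts, max_act)):
            parts.append(f"{tok}\t{level}")  # one "token<TAB>level" pair per line
        parts.append("<end>")
    parts.append("Top positive logits: " + ", ".join(top_logits))
    return "\n".join(parts)


if __name__ == "__main__":
    # Toy values for illustration only; not real activations from any feature.
    toy_examples = [([" Visit", " our", " Services"], [0.0, 4.0, 2.0])]
    print(build_prompt(toy_examples, [" team", " website"]))
```

Normalizing against a single maximum across all examples keeps the levels comparable between snippets; normalizing each snippet separately would be an equally plausible design and is not ruled out by the description.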
    Recent Explanations
    this neuron activates for languages other than English
    gemini-2.0-flash
     in the world, and drops Reenie at home,
    GEMMA-2-2B
    16-GEMMASCOPE-TRANSCODER-16K
    INDEX 16366
    Although
    gemini-2.0-flash
     in science and technology. Although the specific effects of sequestration
    GEMMA-2-2B
    16-GEMMASCOPE-TRANSCODER-16K
    INDEX 16265
    Mathematical notation
    gemini-2.0-flash
    0 = |u\rangle |0\rangle \
    GEMMA-2-2B
    16-GEMMASCOPE-TRANSCODER-16K
    INDEX 16260
    short code snippets
    gemini-2.0-flash
    * Converts the type and the subtype of the parsed media
    GEMMA-2-2B
    16-GEMMASCOPE-TRANSCODER-16K
    INDEX 16276
    network
    gemini-2.0-flash
    ierkiewicz v. Sorema↵N. A.,
    GEMMA-2-2B
    16-GEMMASCOPE-TRANSCODER-16K
    INDEX 16266
    data and anonymity
    gemini-2.0-flash
     quantitatively the same results (data are not shown).↵↵
    GEMMA-2-2B
    16-GEMMASCOPE-TRANSCODER-16K
    INDEX 16259
    dates and numerical ranges
    gemini-2.0-flash
    4 months from June to October 2018
    GEMMA-2-2B
    16-GEMMASCOPE-TRANSCODER-16K
    INDEX 16284
    Legal citations
    gemini-2.0-flash
     Kan. 77, 79, 
    GEMMA-2-2B
    16-GEMMASCOPE-TRANSCODER-16K
    INDEX 16279
    capitalized words in sports articles
    gemini-2.0-flash
     Cureton sparked a second half comeback to pull out a
    GEMMA-2-2B
    16-GEMMASCOPE-TRANSCODER-16K
    INDEX 16283
    code
    gemini-2.0-flash
     scientists, this is first discovery in Southeast Asiaon Pla
    GEMMA-2-2B
    16-GEMMASCOPE-TRANSCODER-16K
    INDEX 16270
    gibberish/randomness
    gemini-2.0-flash
     can't." A rumbling, huge memory from the
    GEMMA-2-2B
    16-GEMMASCOPE-TRANSCODER-16K
    INDEX 16313
    code
    gemini-2.0-flash
    -----------------\n" <<↵                 "Array Size (
    GEMMA-2-2B
    16-GEMMASCOPE-TRANSCODER-16K
    INDEX 16377
    technical/mechanical descriptions
    gemini-2.0-flash
     rod handle in positive angular engagement with each other about a
    GEMMA-2-2B
    16-GEMMASCOPE-TRANSCODER-16K
    INDEX 16346
    seemingly random tokens in running text
    gemini-2.0-flash
     Turkmenistan national team beat the tournament hosts Nepal (0–
    GEMMA-2-2B
    16-GEMMASCOPE-TRANSCODER-16K
    INDEX 16355
    our/us and nearby words
    gemini-2.0-flash
    ↵hassles↵↵Visit our Services Page to see a
    GEMMA-2-2B
    16-GEMMASCOPE-TRANSCODER-16K
    INDEX 16338
    seemingly random text and code snippets
    gemini-2.0-flash
     White supervisor out of the equation, especially, and next
    GEMMA-2-2B
    16-GEMMASCOPE-TRANSCODER-16K
    INDEX 16340
    words present in questions and video game reviews
    gemini-2.0-flash
     of horror for me. Horror is about the creeping shiver
    GEMMA-2-2B
    16-GEMMASCOPE-TRANSCODER-16K
    INDEX 16288
    Unclear, but it may be a period followed by one or more common words or characters used in HTML
    gemini-2.0-flash
     child-friendly events.↵↵We are still in the
    GEMMA-2-2B
    16-GEMMASCOPE-TRANSCODER-16K
    INDEX 16299
    unclear
    gemini-2.0-flash
     machine all the time. “They help each other out
    GEMMA-2-2B
    16-GEMMASCOPE-TRANSCODER-16K
    INDEX 16378
    the letter [any capital letter]
    gemini-2.0-flash
     the sample size and the letter indicates whether the sample had
    GEMMA-2-2B
    16-GEMMASCOPE-TRANSCODER-16K
    INDEX 16363