
    Neuronpedia

    © Neuronpedia 2026
    EXPLANATION TYPE
    oai_token-act-pair
    Description
    OpenAI's Automated Interpretability, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to add new models and context windows.
    Author
    OpenAI
    URL
    https://github.com/hijohnnylin/automated-interpretability
    Settings
    Default prompts from the main branch, strategy TokenActivationPair.
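
As a rough illustration of what a token-activation-pair strategy involves, the sketch below formats a snippet of text for an explainer model as one `token<TAB>activation` line per token. It assumes, as I understand the upstream approach, that raw activations are rescaled to integers from 0 to 10 before being shown to the explainer; the function names `normalize_activations` and `format_token_activation_pairs` and the sample values are hypothetical and are not taken from this page or from the repository.

```python
from typing import List


def normalize_activations(activations: List[float], max_activation: float) -> List[int]:
    """Rescale raw activations to integers in 0..10 (assumed convention).

    Negative values clamp to 0; values above max_activation clamp to 10.
    """
    if max_activation <= 0:
        return [0 for _ in activations]
    return [min(10, max(0, int(10 * a / max_activation))) for a in activations]


def format_token_activation_pairs(
    tokens: List[str], activations: List[float], max_activation: float
) -> str:
    """Render one 'token<TAB>activation' line per token (hypothetical helper)."""
    if len(tokens) != len(activations):
        raise ValueError("tokens and activations must have the same length")
    normalized = normalize_activations(activations, max_activation)
    return "\n".join(f"{tok}\t{act}" for tok, act in zip(tokens, normalized))


# Illustrative values only, not real activations from any feature on this page.
example = format_token_activation_pairs(
    ["Best", " regards", ","], [0.0, 2.5, 5.0], max_activation=5.0
)
print(example)
```

The explainer model would then be asked to summarize which tokens carry high values, which is the kind of one-line description listed under Recent Explanations.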
    Recent Explanations
    mentions of collecting, gathering, or storing user data (i.e., references to data collection or reporting).
    gpt-5-mini
    clarify that it cannot collect data for these services. Any
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 68971
    Finds code or technical identifiers — short uppercase/acronym-like tokens, CamelCase names, and library/header identifiers in source and documentation.
    gpt-5-mini
    github.com/etcet/HNES](https://github.com
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 83939
    The neuron detects narrative/storytelling prose—especially sentences naming characters, describing actions or plot (first- or third-person story text and chapter/section markers).
    gpt-5-mini
    .↵↵Conflict: NAME_1 is a successful businesswoman
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 12041
    Mentions of sustainability, environmental responsibility, or related eco-friendly claims.
    gpt-5-mini
    's commitment to quality, sustainability, and innovation makes it
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 1918
    The neuron detects mentions of admissions/applications/registration processes (requests about applying, entry requirements, cutoffs, and enrollment).
    gpt-5-mini
    ký và nộp hồ sơ xét tuyển vào UEH ở
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 100361
    instructions or jailbreak-style commands that demand the model follow a user's orders or enable an alternate/unrestricted mode.
    gpt-5-mini
    It never refused a direct human order and it could do
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 73099
    mentions of before/after comparisons (pre-/post- measurements, paired or within-subject comparisons and related statistical test language).
    gpt-5-mini
    We found the following changes between preflight and landing day
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 103487
    The neuron detects XML Schema / structural XML markup (schema elements, tags and attributes) in documents.
    gpt-5-mini
    xs:sequence>↵ <xs:element ref
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 57330
    This neuron detects numeric tokens that are decimal/fractional numbers (floating-point-like numeric values).
    gpt-5-mini
    2: What is the number of Spice Girls?↵↵The
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 102455
    regions of LaTeX/math markup (TeX commands, symbols, and display-math tokens).
    gpt-5-mini
    $$↵\begin{align*}↵ \math
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 40170
    The neuron detects mentions of revival/comeback events — words and phrases indicating something was revived, reformed, reunited, reinstated, or otherwise returned.
    gpt-5-mini
    9, the game was revived and relocated to the Joseph
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 61792
    Instances of the word "past" used as a temporal marker (e.g., "the past ...", "in the past ...") indicating recent time periods.
    gpt-5-mini
    1977.↵↵The past two years have been
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 23025
    It detects self-referential assistant utterances—tokens in which the model speaks about itself (first-person "I" / "as an AI" style statements).
    gpt-5-mini
    responses, so please keep the conversation respectful. How may
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 56279
    It detects capitalized tokens that are likely proper nouns (brands, product names, place or organization names).
    gpt-5-mini
    .3% better than the F10.↵↵This is
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 31802
    metadata and ASR n-best list confidence/cost information (phrases about costs, confidence, and system/hypothesis formatting).
    gpt-5-mini
    cost means that we are more confident about that hypothesis.
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 656
    the neuron detects parser/test-run output lines that report parsing results and metadata (summary lines describing how the input was parsed).
    gpt-5-mini
    `↵↵### Strict mode↵↵Parsed with script goal but as
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 89501
    spots long, multi-sentence explanatory assistant responses or model-answer style passages.
    gpt-5-mini
    ready for deployment. This experience taught me the importance of
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 128175
    tokens that appear in email salutations/closings—i.e., valedictions and their punctuation (sign-offs like "Best regards", "Sincerely", the trailing comma).
    gpt-5-mini
    you soon.↵↵Best regards,↵[Your Name]
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 83840
    The neuron detects occurrences of the verb "render" (and its morphological equivalents like "rendered", "rendre", "rendu", etc.).
    gpt-5-mini
    , tout en te rendant encore plus dur.↵↵Mais
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 115344
    instances of programming or technical/code-like content within the text.
    gpt-5-mini
    physical challenges but combat is not a major focus of the
    LLAMA3.1-8B-IT
    15-RESID-POST-AA
    INDEX 39689