OpenAI's Automated Interpretability, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to add new models/context windows.
The neuron detects first- and second-person pronouns and related conversational verb forms that indicate personal/addressing language (e.g., "I", "we", "you", "have", "had").
gpt-5-mini
launch.↵Those who I have already got booked on will
The neuron is sensitive to tokens occurring in formal or technical/mathematical contexts—e.g. LaTeX commands, variables, theorem‐ or proof‐style wording, and other formulaic expressions.
The neuron detects document-structure and formatting/markup elements (LaTeX/math constructs, section headings/labels, metadata and other non-prose formatting tokens).
This neuron lights up on informal, interactive bits of user comments—especially question marks and small reaction/interjection tokens (e.g. “back,” “wow,” “now?”) that signal a conversational or reactive utterance.
long, multi-sentence assistant responses or explanatory/system-generated text.
gpt-5-mini
retti a confrontarsi con realtà differenti.Ad esempio
A strong detector for sudden, emphatic exclamations or high-intensity emotional interjections (loud reactions, urgencies, and similar bursty dialogue).
gpt-5-mini
He's here!Oh, dear lord
the token "voxel" (references to voxel variables in code).
gpt-5-mini
, const Voxel& voxelB) const {↵
Finds assistant safety/policy language — refusals, disclaimers, and explanations about prohibited content or why the model can't comply.
gpt-5-mini
age.<end_of_turn>↵<start_of_turn>model↵The text **does
The neuron detects key content nouns and technical/quantitative terms (domain-specific entities, measurements, statuses and similar important words).
gpt-5-mini
by around half with slightly degraded model quality" into Chinese
sentence-initial expletive/subject pronoun "it" (including contractions like "it's") used to start or emphasize clauses.
gpt-5-mini
thrilled her, though. It was the *possibility
The neuron detects important content words: the meaningful nouns, verbs, and adjectives that carry the main information in a sentence.
gpt-5-mini
, learns multiple languages, and excels in his field,