OpenAI's Automated Interpretability, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to add new models/context windows.
The neuron detects first- and second-person pronouns and related conversational verb forms that indicate personal/addressing language (e.g., "I", "we", "you", "have", "had").
gpt-5-mini
launch.↵Those who I have already got booked on will
The neuron is sensitive to tokens occurring in formal or technical/mathematical contexts—e.g. LaTeX commands, variables, theorem- or proof-style wording, and other formulaic expressions.
The neuron detects document-structure and formatting/markup elements (LaTeX/math constructs, section headings/labels, metadata and other non-prose formatting tokens).
This neuron fires on informal, interactive bits of user comments—especially question marks and small reaction/interjection tokens (e.g. "back," "wow," "now?") that signal a conversational or reactive utterance.
A strong detector for sudden, emphatic exclamations or high-intensity emotional interjections (loud reactions, urgencies, and similar bursty dialogue).