OpenAI's Automated Interpretability, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to add new models and context windows.
numeric list markers and words that introduce or structure enumerations.
claude-4-5-sonnet
a resource with the same or higher classification level. (
words indicating positive evaluation, correctness, or satisfactory status.
claude-4-5-sonnet
regulatory licenses and is generally compliant in
elements of formal news article or report formatting, including proper nouns, location names, attribution verbs, and structural markers.
claude-4-5-sonnet
↵↵**HILLSDALE, CA -** Chaos erupted
chain-of-thought reasoning and explicit step-by-step problem-solving.
claude-4-5-haiku
model↵Inner dialog: Okay, this is a simple
situations of imminent danger or public emergencies—such as missing persons, crimes, or alerts—and guidance to contact law enforcement or take immediate safety action.
gpt-5
immediately if you believe someone is missing.**↵↵There
important semantic content words that carry key meaning in a passage.
claude-4-5-haiku
start of their journey what trials lay in store, none
common grammatical function words and articles like "a," "the," "to," "of," and "be."
claude-4-5-sonnet
States:**This is the biggest economy in the world
formatted text structure and section boundaries, particularly spaces, periods, and markers that divide or organize content into distinct parts.
claude-4-5-sonnet
Pole) (Difficulty:1/5)**↵
the beginning of the AI model's response turn or self-referential speech.
claude-4-5-haiku
↵<start_of_turn>model↵Hi! I'm Gemma,
words related to collective identity, belonging, and social groups.
claude-4-5-sonnet
!"↵*"Our five-year plan:
ideologically charged or controversial political and social viewpoints, particularly arguments from conservative, libertarian, or contrarian perspectives on contentious topics.
claude-4-5-haiku
" often state their goal is to celebrate traditional families and