EXPLANATION TYPE
oai_token-act-pair
Description
OpenAI's automated interpretability method from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to add new models and context windows.
Author
OpenAI
URL
https://github.com/hijohnnylin/automated-interpretability
Settings
Default prompts from the main branch, using the TokenActivationPair strategy.
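The TokenActivationPair strategy pairs each token in an activation record with its activation, discretized to an integer scale, when building the explainer prompt. A minimal sketch of that formatting, assuming the 0-10 scale described in the paper; the function name and tab-separated layout here are illustrative, not the repository's exact code:

```python
def format_token_activation_pairs(tokens, activations, max_scale=10):
    """Render each token with its activation scaled to an integer 0..max_scale.

    Activations are normalized by the peak activation in the record, so the
    strongest token always maps to max_scale (illustrative sketch only).
    """
    peak = max(activations) or 1.0  # avoid division by zero for dead neurons
    lines = []
    for tok, act in zip(tokens, activations):
        scaled = round(max_scale * act / peak)
        lines.append(f"{tok}\t{scaled}")
    return "\n".join(lines)

# Example: a record where " compiling" is the peak-activating token.
example = format_token_activation_pairs(
    ["Okay", ",", " compiling", " a", " list"],
    [0.1, 0.0, 3.2, 0.4, 2.9],
)
print(example)
```

The explainer model then sees these token/value pairs and is asked to summarize what the high-valued tokens have in common, which is what produces explanation texts like those listed below.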
Recent Explanations
The neuron activates when the model begins or signals that it is compiling or preparing a detailed list or multi-item response (phrases like "Okay, compiling a list...").
gpt-5-mini
<start_of_turn>model↵Okay, compiling a list of directors (
GEMMA-3-27B-IT
31-GEMMASCOPE-2-RES-262K
INDEX 30047
A location/tourist-place detector that activates on place names and mentions of cities, districts, or landmarks in travel-itinerary text.
gpt-5-mini
: Arrival & Shinjuku (F, L,
GEMMA-3-27B-IT
31-GEMMASCOPE-2-RES-262K
INDEX 18890
The neuron strongly responds to tokens containing the substring "act" (e.g., act, actuated, actuation, etc.).
gpt-5-mini
a highly realistic, functional quadruped cheetah suit, prioritizing
GEMMA-3-27B-IT
16-GEMMASCOPE-2-RES-262K
INDEX 138203
References to setting or updating a GameObject's transform/position in Unity code (e.g., assignments using transform.position or new Vector3).
gpt-5-mini
cube.transform.position = new Vector3(
GEMMA-3-27B-IT
31-GEMMASCOPE-2-RES-262K
INDEX 1256
This neuron detects template fields or signature/title labels—tokens that mark places for a person's title, role, or document section headings in business/email templates.
gpt-5-mini
Name]↵[Your Title]↵```↵↵*
GEMMA-3-27B-IT
16-GEMMASCOPE-2-RES-262K
INDEX 17677
This neuron detects numeric tokens and number words—markers of counts, list items, or numbered sequences.
gpt-5-mini
description (at least a few sentences) and *n
GEMMA-3-27B-IT
31-GEMMASCOPE-2-RES-262K
INDEX 939
The neuron activates on short technical identifiers/labels, e.g., code/package/entity names, acronyms, and UI/control identifiers.
gpt-5-mini
↵↵"""↵↵package anu is↵↵constant m:
GEMMA-3-27B-IT
16-GEMMASCOPE-2-RES-262K
INDEX 46318
This neuron does not respond to any tokens — it remains inactive and does not detect any pattern.
gpt-5-mini
bidet attachment (see section IV) to reduce TP
GEMMA-3-27B-IT
31-GEMMASCOPE-2-RES-262K
INDEX 510
The neuron detects section headings or emphasized technical/title tokens (tokens that appear in headers, bolded or important terms, or list/section titles).
gpt-5-mini
and then trains the model to *generate* those masked
GEMMA-3-27B-IT
31-GEMMASCOPE-2-RES-262K
INDEX 41787
References to cooking appliances, burners, and fuel-related equipment (mentions of stoves, grills, and their fuel/ignition components).
gpt-5-mini
multi-fuel. Single burner or double.)↵*
GEMMA-3-27B-IT
16-GEMMASCOPE-2-RES-262K
INDEX 41787
This neuron detects references to elections and voter participation (mentions of voting, turnout, and election-related metrics).
gpt-5-mini
only approved candidates. Turnout was mandated, and "
GEMMA-3-27B-IT
16-GEMMASCOPE-2-RES-262K
INDEX 510
The neuron detects named entities/proper nouns (multi-token spans like place, organization, or technology names).
gpt-5-mini
. **Saskatchewan** - Known for its prairies
GEMMA-3-27B-IT
31-GEMMASCOPE-2-RES-262K
INDEX 438
This neuron activates on tokens containing the substring "quant" (e.g., "quant", "Quant", "Quantum"), i.e., references to quantitative/quant/quantum.
gpt-5-mini
↵Give a five years quant analysis curriculum and include a
GEMMA-3-27B-IT
16-GEMMASCOPE-2-RES-262K
INDEX 120295
The neuron detects major section headings and structural document markers (e.g., numbered or bolded section titles and topic headers).
gpt-5-mini
. Social Reasons & Maintaining Relationships**↵↵Humans are social
GEMMA-3-27B-IT
31-GEMMASCOPE-2-RES-262K
INDEX 3647
Tokens that never activate: an effectively inactive neuron.
gpt-5-mini
0 words (depending on the version
GEMMA-3-27B-IT
16-GEMMASCOPE-2-RES-262K
INDEX 4222
Tokens that indicate continuation or ongoing/continuous action (words signaling that something continues).
gpt-5-mini
collaborative environment where I can continue to learn and grow as
GEMMA-3-27B-IT
16-GEMMASCOPE-2-RES-262K
INDEX 3647
Mentions of technical, domain-specific concepts or jargon (specialized terms like "serverless", "application-specific", product or architecture names, or technical keywords).
gpt-5-mini
not *optimized* for the specific, repetitive calculations needed
GEMMA-3-27B-IT
16-GEMMASCOPE-2-RES-262K
INDEX 103733
Tokens that mark gradual erosion or diminution of something (phrases describing wearing away or chipping away at a person, object, or state).
gpt-5-mini
his victories, relentlessly chipping away at her composure.
GEMMA-3-27B-IT
16-GEMMASCOPE-2-RES-262K
INDEX 41723
The neuron primarily detects the token "happy" (and close variants/uses of "Happy") — i.e., expressions of happiness/positive greetings.
gpt-5-mini
is: Happy<end_of_turn>↵<start_of_turn>model↵Okay, here
GEMMA-3-27B-IT
31-GEMMASCOPE-2-RES-262K
INDEX 143759
The neuron detects isolated single-character tokens (single letters or initials) across scripts.
gpt-5-mini
ักษรที่พัฒนามาจากอักษรไทย (ซึ่ง
GEMMA-3-27B-IT
31-GEMMASCOPE-2-RES-262K
INDEX 138853