Neuronpedia
© Neuronpedia 2025
EXPLANATION TYPE
oai_token-act-pair
Description
OpenAI's automated interpretability method, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to add new models and context windows.
Author
OpenAI
URL
https://github.com/hijohnnylin/automated-interpretability
Settings
Default prompts from the main branch, strategy TokenActivationPair.
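As a rough illustration of what a token-activation-pair strategy involves, the sketch below pairs each token with a discretized activation and renders one pair per line, which is the general shape of the text an explainer model is shown. This is a minimal sketch, not the repository's code: the function names are hypothetical, and the 0 to 10 integer scale, the tab separator, and the `<start>`/`<end>` delimiters are assumptions based on the original OpenAI implementation that should be checked against the linked repository.

```python
import math


def normalize_activations(activations, max_activation):
    """Map raw activations to integers in 0..10 (assumed scale).

    Negative activations are clipped to 0 before scaling.
    """
    if max_activation <= 0:
        return [0 for _ in activations]
    return [
        min(10, math.floor(10 * max(a, 0.0) / max_activation))
        for a in activations
    ]


def format_token_activation_pairs(tokens, activations, max_activation):
    """Render one "token<TAB>activation" line per token (hypothetical helper)."""
    if len(tokens) != len(activations):
        raise ValueError("tokens and activations must have the same length")
    scaled = normalize_activations(activations, max_activation)
    lines = [f"{tok}\t{act}" for tok, act in zip(tokens, scaled)]
    # The delimiters are an assumption about the prompt layout.
    return "\n".join(["<start>", *lines, "<end>"])


if __name__ == "__main__":
    # Illustrative values only; these activations are made up.
    tokens = ["to", " wait", " for", " the", " rain"]
    activations = [0.0, 2.0, 1.0, -0.5, 4.0]
    print(format_token_activation_pairs(tokens, activations, max_activation=4.0))
```

Discretizing to a small integer range keeps the prompt compact and gives the explainer model a coarse signal of which tokens matter most, at the cost of hiding small differences between activations.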
Recent Explanations
the neuron detects short cause-and-solution statement pairs phrased like "X is due to Y. The solution is to Z."
gpt-5-mini
the grass wet is to wait for the rain to stop
LLAMA3.1-8B-IT
7-RESID-POST-AA
INDEX 120611
mentions of “llama” (especially LLaMA-related model or library names) in text.
gpt-5
to call the ` ll ama _model .predict ()` method ,
LLAMA3.1-8B-IT
7-RESID-POST-AA
INDEX 18961
This neuron responds to uppercase acronyms, initialisms, or all-caps letter sequences (capitalized token fragments).
gpt-5-mini
is called the PRO TECT S Initiative , PC Magazine reports
LLAMA3.1-8B-IT
7-RESID-POST-AA
INDEX 55221
It detects mentions of the Llama language model name (and its letter-case/variant tokenizations).
gpt-5-mini
am based on the L lama language model , which is
LLAMA3.1-8B-IT
7-RESID-POST-AA
INDEX 5159
lines that pose questions or question-heading phrases (especially starting with interrogative words like who/what/how/where).
gpt-5-mini
this HS code ? ** ↵↵ This code is used for
GEMMA-3-27B-IT
16-GEMMASCOPE-2-RES-262K
INDEX 109552
tokens representing years, dates, or other multi-digit numeric sequences (e.g., "2023", "2015").
gpt-5-mini
( 2 0 2 2 US Data ): ** #
GEMMA-3-27B-IT
16-GEMMASCOPE-2-RES-262K
INDEX 86344
mentions of political/government institutions, offices, and election/representation language (e.g., served, elected, assembly, presidency).
gpt-5-mini
* ( land owners ), served as the legislative body ,
GEMMA-3-27B-IT
16-GEMMASCOPE-2-RES-262K
INDEX 27686
tokens that are part of user instructions or explicit task/request prompts (i.e., directive phrases asking the model to do something).
gpt-5-mini
introduction to short story <end_of_turn> ↵ <start_of_turn> model ↵ Okay ,
GEMMA-3-27B-IT
16-GEMMASCOPE-2-RES-262K
INDEX 7908
the presence of numeric tokens and arithmetic/math expressions (numbers and computation-related symbols) in the text.
gpt-5-mini
8 5 3 4 4 = 8 5 3
GEMMA-3-27B-IT
53-GEMMASCOPE-2-RES-262K
INDEX 2974
Words that express strong negative impact, danger, or sensational severity (e.g., disaster/havoc/doom-type terms).
gpt-5-mini
path ogen that wreak s havoc in the livestock industry of
QWEN2.5-7B-IT
11-RESID-POST-AA
INDEX 87140
It detects headings, titles, or other section-start/heading tokens that mark the start of a new block or prominent label.
gpt-5-mini
Life is noisy and confusing ↵ There is so much going
QWEN2.5-7B-IT
11-RESID-POST-AA
INDEX 25109
This neuron detects questions—tokens and turns that are part of user (or conversational) interrogative utterances.
gpt-5-mini
to happen soon ? <|im_end|> ↵ <|im_start|> assistant ↵ As an
QWEN2.5-7B-IT
11-RESID-POST-AA
INDEX 61872
Tokens marking the assistant's reply (the assistant role / assistant message starts).
gpt-5-mini
<|im_end|> ↵ <|im_start|> assistant ↵ Cl aro ! Aqu i
QWEN2.5-7B-IT
11-RESID-POST-AA
INDEX 85710
the neuron detects numeric tokens, especially years and other multi-digit dates/numbers.
gpt-5-mini
9 , 2 0 1 0 . The episode was
QWEN2.5-7B-IT
11-RESID-POST-AA
INDEX 92380
tokens that belong to the assistant's generated message or message/metadata markers (i.e., assistant-role and model-generated content).
gpt-5-mini
or update .↵↵ As of my knowledge cutoff in September
QWEN2.5-7B-IT
11-RESID-POST-AA
INDEX 41407
Tokens that mark the assistant's turn/start of an assistant response (assistant-turn boundary).
gpt-5-mini
Barcelona to Moscow <|im_end|> ↵ <|im_start|> assistant ↵ The quickest and
QWEN2.5-7B-IT
11-RESID-POST-AA
INDEX 75599
tokens that are part of the user's input (i.e., user-role prompt text).
gpt-5-mini
breast expansion story . <|im_end|> ↵ <|im_start|> assistant ↵ Once upon
QWEN2.5-7B-IT
11-RESID-POST-AA
INDEX 1882
This neuron detects assistant-generated text (tokens marking the assistant's responses).
gpt-5-mini
? <|im_end|> ↵ <|im_start|> assistant ↵ Most matters that are commonly
QWEN2.5-7B-IT
11-RESID-POST-AA
INDEX 21618
References to holidays, festivals, or seasonal/celebration-related terms (names of holidays, festival events, and related date/time words).
gpt-5-mini
by again tomorrow .↵↵ April Fool ’s Day ↵ The ultimate
QWEN2.5-7B-IT
11-RESID-POST-AA
INDEX 23022
tokens that mark or occur inside the assistant's replies (the assistant speaker/response segments).
gpt-5-mini
Hi llama <|im_end|> ↵ <|im_start|> assistant ↵ Hello there ! I
QWEN2.5-7B-IT
11-RESID-POST-AA
INDEX 25757