Neuronpedia
© Neuronpedia 2025
EXPLANATION TYPE
oai_token-act-pair
Description
OpenAI's Automated Interpretability, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to support additional models and context windows.
Author
OpenAI
URL
https://github.com/hijohnnylin/automated-interpretability
Settings
Default prompts from the main branch, using the TokenActivationPair strategy.
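As a rough illustration of what a token-activation-pair prompt can look like: in the approach described in the OpenAI paper, each token is shown to the explainer model next to its activation, discretized to an integer scale from 0 to 10. The sketch below is a reconstruction under stated assumptions, not the code from the linked repository. The function names, the tab separator, and the clipping of negative values to zero are illustrative choices.

```python
import math
from typing import List, Sequence


def normalize_activations(
    activations: Sequence[float], max_activation: float, scale: int = 10
) -> List[int]:
    """Map raw activations to integers in [0, scale].

    Assumption: negative activations are clipped to 0 and values are
    floored after scaling by the maximum activation.
    """
    if max_activation <= 0:
        return [0 for _ in activations]
    return [
        min(scale, math.floor(scale * max(a, 0.0) / max_activation))
        for a in activations
    ]


def format_token_activation_pairs(
    tokens: Sequence[str], activations: Sequence[float], max_activation: float
) -> str:
    """Render one "token<TAB>activation" line per token (hypothetical layout)."""
    if len(tokens) != len(activations):
        raise ValueError("tokens and activations must have the same length")
    normalized = normalize_activations(activations, max_activation)
    return "\n".join(f"{tok}\t{act}" for tok, act in zip(tokens, normalized))


if __name__ == "__main__":
    # Made-up activation values, for illustration only.
    print(format_token_activation_pairs(["a", " safe", " and"], [0.0, 2.5, 5.0], 5.0))
```

Each output line pairs one token with its discretized activation, which is the kind of record an explainer model would read before writing a one-sentence explanation like those listed under Recent Explanations.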
Recent Explanations
The neuron fires on the model’s self-descriptive safety/disclaimer statements (e.g. “I am programmed to be a safe and helpful AI assistant”).
o4-mini
a safe and helpful AI assistant . As such , I
GEMMA-3-27B-IT
16-GEMMASCOPE-2-RES-262K
INDEX 7539
explicit date and year expressions, especially numerals and month names in timestamps.
gpt-5
in 1 9 4 9 at Edwards Air Force Base
GEMMA-3-27B-IT
16-GEMMASCOPE-2-RES-262K
INDEX 1060
This neuron detects tokens that are floating-point numeric strings (numbers with a decimal point).
gpt-5-mini
uploads / 2 0 1 6 / 0 6 / long
GEMMA-3-27B-IT
16-GEMMASCOPE-2-RES-262K
INDEX 7139
The neuron detects date/time tokens and explicit temporal references (months, days, years, and timestamps).
gpt-5-mini
, 2 0 2 3 . I ' m
GEMMA-3-27B-IT
16-GEMMASCOPE-2-RES-262K
INDEX 3502
The neuron is detecting numeric tokens and punctuation used in dates (e.g. year, month, day numbers and their separators).
o4-mini
Today 's date is Friday , [ current date ]. I
LLAMA3.3-70B-IT
50-RESID-POST-GF
INDEX 38333
tokens that are digits or parts of date/time strings (numbers, years, and date fragments).
gpt-5-mini
of my knowledge cutoff date of September 202 1 ,
LLAMA3.3-70B-IT
50-RESID-POST-GF
INDEX 34344
mentions of dates or date-related phrases (e.g., years, months, "current date", "knowledge cutoff").
gpt-5-mini
Today 's date is Friday , [ current date ]. I
LLAMA3.3-70B-IT
50-RESID-POST-GF
INDEX 38333
the neuron detects short cause-and-solution statement pairs phrased like "X is due to Y. The solution is to Z."
gpt-5-mini
the grass wet is to wait for the rain to stop
LLAMA3.1-8B-IT
7-RESID-POST-AA
INDEX 120611
mentions of “llama” (especially LLaMA-related model or library names) in text.
gpt-5
to call the ` ll ama _model .predict ()` method ,
LLAMA3.1-8B-IT
7-RESID-POST-AA
INDEX 18961
This neuron responds to uppercase acronyms, initialisms, or all-caps letter sequences (capitalized token fragments).
gpt-5-mini
is called the PRO TECT S Initiative , PC Magazine reports
LLAMA3.1-8B-IT
7-RESID-POST-AA
INDEX 55221
It detects mentions of the Llama language model name (and its letter-case/variant tokenizations).
gpt-5-mini
am based on the L lama language model , which is
LLAMA3.1-8B-IT
7-RESID-POST-AA
INDEX 5159
lines that pose questions or question-heading phrases (especially starting with interrogative words like who/what/how/where).
gpt-5-mini
this HS code ? ** ↵↵ This code is used for
GEMMA-3-27B-IT
16-GEMMASCOPE-2-RES-262K
INDEX 109552
tokens representing years, dates, or other multi-digit numeric sequences (e.g., "2023", "2015").
gpt-5-mini
( 2 0 2 2 US Data ): ** #
GEMMA-3-27B-IT
16-GEMMASCOPE-2-RES-262K
INDEX 86344
mentions of political/government institutions, offices, and election/representation language (e.g., served, elected, assembly, presidency).
gpt-5-mini
* ( land owners ), served as the legislative body ,
GEMMA-3-27B-IT
16-GEMMASCOPE-2-RES-262K
INDEX 27686
tokens that are part of user instructions or explicit task/request prompts (i.e., directive phrases asking the model to do something).
gpt-5-mini
introduction to short story <end_of_turn> ↵ <start_of_turn> model ↵ Okay ,
GEMMA-3-27B-IT
16-GEMMASCOPE-2-RES-262K
INDEX 7908
the presence of numeric tokens and arithmetic/math expressions (numbers and computation-related symbols) in the text.
gpt-5-mini
8 5 3 4 4 = 8 5 3
GEMMA-3-27B-IT
53-GEMMASCOPE-2-RES-262K
INDEX 2974
Words that express strong negative impact, danger, or sensational severity (e.g., disaster/havoc/doom-type terms).
gpt-5-mini
path ogen that wreak s havoc in the livestock industry of
QWEN2.5-7B-IT
11-RESID-POST-AA
INDEX 87140
It detects headings, titles, or other section-start/heading tokens that mark the start of a new block or prominent label.
gpt-5-mini
Life is noisy and confusing ↵ There is so much going
QWEN2.5-7B-IT
11-RESID-POST-AA
INDEX 25109
This neuron detects questions—tokens and turns that are part of user (or conversational) interrogative utterances.
gpt-5-mini
to happen soon ? <|im_end|> ↵ <|im_start|> assistant ↵ As an
QWEN2.5-7B-IT
11-RESID-POST-AA
INDEX 61872
Tokens marking the assistant's reply (the assistant role / assistant message starts).
gpt-5-mini
<|im_end|> ↵ <|im_start|> assistant ↵ Cl aro ! Aqu i
QWEN2.5-7B-IT
11-RESID-POST-AA
INDEX 85710