© Neuronpedia 2026
Neuronpedia
EXPLANATION TYPE
oai_token-act-pair
Description
OpenAI's Automated Interpretability, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to add new models and context windows.
Author
OpenAI
URL
https://github.com/hijohnnylin/automated-interpretability
Settings
Uses the default prompts from the main branch with the TokenActivationPair strategy. Explanations are generated from the top 10 deduplicated activations.
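The settings above can be illustrated with a minimal sketch of how the top 10 deduplicated activations might be selected and rendered as token/activation pairs. This is not the repository's code. The `ActivationRecord` class, the helper names, the rule that two records are duplicates when their token sequences match exactly, the 0 to 10 integer scaling, and the `<start>`/`<end>` delimiters are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class ActivationRecord:
    """Hypothetical container: one text snippet and its per-token activations."""
    tokens: list[str]
    activations: list[float]  # assumed non-empty, same length as tokens


def top_k_deduplicated(records, k=10):
    """Return up to k records with the highest peak activation.

    Assumption: records with an identical token sequence count as duplicates,
    and only the first (highest-ranked) copy is kept.
    """
    seen = set()
    unique = []
    for rec in sorted(records, key=lambda r: max(r.activations), reverse=True):
        key = tuple(rec.tokens)
        if key in seen:
            continue
        seen.add(key)
        unique.append(rec)
        if len(unique) == k:
            break
    return unique


def normalize(activations, max_activation):
    """Scale activations to integers in 0..10 (negative values clamp to 0)."""
    if max_activation <= 0:
        # Nothing fired: every token is reported as 0.
        return [0] * len(activations)
    return [min(10, math.floor(10 * max(a, 0.0) / max_activation)) for a in activations]


def format_token_activation_pairs(records):
    """Render each record as tab-separated token/activation lines."""
    max_act = max((max(r.activations) for r in records), default=0.0)
    blocks = []
    for rec in records:
        scaled = normalize(rec.activations, max_act)
        lines = [f"{tok}\t{act}" for tok, act in zip(rec.tokens, scaled)]
        blocks.append("<start>\n" + "\n".join(lines) + "\n<end>")
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [
        ActivationRecord(["a", "b"], [0.0, 2.0]),
        ActivationRecord(["a", "b"], [0.0, 2.0]),  # duplicate, dropped
        ActivationRecord(["c"], [4.0]),
    ]
    print(format_token_activation_pairs(top_k_deduplicated(demo)))
```

In this sketch, a feature whose selected snippets never activate yields all-zero pairs, which gives the explainer model nothing to describe.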
Recent Explanations
future intentions or plans.
gemini-2.5-flash-lite
on a vacation that was planned months ago .\n\n I have
LLAMA3.3-70B-IT
50-RESID-POST-GF
INDEX 1029
the numeral 7, especially within phone numbers or numerical strings.
gpt-5
4 3 - 5 6 7 8 ) https ://
GEMMA-3-27B-IT
57-GEMMASCOPE-2-TRANSCODER-262K
INDEX 37411
I cannot determine what this neuron is looking for based on the provided data, as it shows zero activation values across all tokens in every example.
claude-4-5-haiku
4 3 - 5 6 7 8 ) https ://
GEMMA-3-27B-IT
57-GEMMASCOPE-2-TRANSCODER-262K
INDEX 37411
phrases related to conflicts of interest.
gemini-2.5-flash-lite
' s integrity or creating conflicts of interest with portfolio companies
GEMMA-3-27B-IT
46-GEMMASCOPE-2-TRANSCODER-262K
INDEX 66235
tokens related to the structure of arguments, specifically premises and conclusions.
gemini-2.5-flash-lite
argument is invalid ↵↵↵ This argument is valid ↵↵↵ The conclusion
GEMMA-3-27B-IT
37-GEMMASCOPE-2-TRANSCODER-262K
INDEX 136881
coding-related tokens such as `password`, numerical digits, and technical terms like `bcrypt` and `admin`.
gemini-2.5-flash-lite
. hash (' password 1 2 3 '), # Store
GEMMA-3-27B-IT
32-GEMMASCOPE-2-TRANSCODER-262K
INDEX 235328
section headers and formatting common in structured AI-generated text.
gemini-2.5-flash-lite
, Ud io , R iff usion ** - Generate music
GEMMA-3-27B-IT
1-GEMMASCOPE-2-TRANSCODER-262K
INDEX 106170
informal terms of address and conversational interjections.
gemini-2.5-flash-lite
<bos> <start_of_turn> user ↵ Yo man do you know anything about
GEMMA-3-27B-IT
31-GEMMASCOPE-2-TRANSCODER-262K
INDEX 51375
phrases related to building structures or physical connections.
gemini-2.5-flash-lite
, man nimmt das Gan ze Wasser auf der Welt und
GEMMA-3-27B-IT
6-GEMMASCOPE-2-TRANSCODER-262K
INDEX 132247
specific named components or parts within a description.
gemini-2.5-flash-lite
What parts make it up ? ↵ * ** Functions
GEMMA-3-27B-IT
13-GEMMASCOPE-2-TRANSCODER-262K
INDEX 43897
present tense verbs ending in 'ing'.
gemini-2.5-flash-lite
- shaped areas designed to accommodate larger components . The
GEMMA-3-27B-IT
13-GEMMASCOPE-2-TRANSCODER-262K
INDEX 11888
text describing creative work or fictional characters.
gemini-2.5-flash-lite
is a creature of whims y and contradiction . Centuries of
GEMMA-3-27B-IT
2-GEMMASCOPE-2-TRANSCODER-262K
INDEX 110105
the concept of collapse, particularly in the context of wave functions or physical processes.
gemini-2.5-flash-lite
и за ње га , исте жу ћи в рат да
GEMMA-3-27B-IT
50-GEMMASCOPE-2-TRANSCODER-262K
INDEX 197953
tandem or paired systems.
gemini-2.5-flash-lite
efficiency even further ( tand em cells ). ↵ *
GEMMA-3-27B-IT
2-GEMMASCOPE-2-TRANSCODER-262K
INDEX 159486
phrases related to community and togetherness.
gemini-2.5-flash-lite
= 5 ↵ y = " Hello " ↵ print
GEMMA-3-27B-IT
24-GEMMASCOPE-2-TRANSCODER-262K
INDEX 178045
phrases related to user input and model responses in a conversational AI context.
gemini-2.5-flash-lite
you like - for example , under 1 8 ,
GEMMA-3-27B-IT
10-GEMMASCOPE-2-TRANSCODER-262K
INDEX 48493
specific named entities, often locations or organizations, along with descriptive terms.
gemini-2.5-flash-lite
Indust ri equ art ier - Industrial Quarter ): ** ↵↵
GEMMA-3-27B-IT
51-GEMMASCOPE-2-TRANSCODER-262K
INDEX 68194
phrases and concepts related to nuclear technology and its negative consequences.
gemini-2.5-flash-lite
deeply conflicted and largely negative . Here ' s a breakdown
GEMMA-3-27B-IT
18-GEMMASCOPE-2-TRANSCODER-262K
INDEX 88529
phrases related to language and grammar precision.
gemini-2.5-flash-lite
are spelled the same way throughout a document , that dates
GEMMA-3-27B-IT
10-GEMMASCOPE-2-TRANSCODER-262K
INDEX 27171
question-answering interactions.
gemini-2.5-flash-lite
* ** Why don ' t we dry out ? **
GEMMA-3-27B-IT
20-GEMMASCOPE-2-TRANSCODER-262K
INDEX 208875