Neuronpedia
© Neuronpedia 2025
EXPLANATION TYPE
oai_token-act-pair
Description
OpenAI's Automated Interpretability method, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to support additional models and context windows.
Author
OpenAI
URL
https://github.com/hijohnnylin/automated-interpretability
Settings
Default prompts from the main branch, strategy TokenActivationPair.
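As a rough illustration of what a token-activation-pair prompt might look like, the sketch below pairs each token with a discretized activation and renders one pair per line. The function names, the tab separator, the `<start>`/`<end>` delimiters, and the 0 to 10 integer scale are assumptions for illustration and are not confirmed by this page; the activation values are invented, and only the tokens come from the first example under Recent Explanations.

```python
from typing import List


def normalize_activations(
    activations: List[float], max_activation: float, scale: int = 10
) -> List[int]:
    """Clamp negatives to zero and map activations onto integers in [0, scale].

    Hypothetical helper: the exact discretization used upstream is an assumption.
    """
    if max_activation <= 0:
        return [0 for _ in activations]
    return [
        min(scale, int(scale * (max(a, 0.0) / max_activation)))
        for a in activations
    ]


def format_token_activation_pairs(
    tokens: List[str], activations: List[float], max_activation: float
) -> str:
    """Render one "token<TAB>activation" line per token, wrapped in delimiters."""
    normalized = normalize_activations(activations, max_activation)
    lines = [f"{tok}\t{act}" for tok, act in zip(tokens, normalized)]
    return "\n".join(["<start>", *lines, "<end>"])


if __name__ == "__main__":
    # Tokens taken from the first listed example; activations are made up.
    tokens = ["Node", "*", "head"]
    activations = [0.0, 1.0, 4.0]
    print(format_token_activation_pairs(tokens, activations, max_activation=4.0))
```

An explainer model would then be asked to summarize what the high-activation tokens have in common, which is the kind of one-line description shown in the entries below.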
Recent Explanations
programming code blocks and structure keywords across various programming languages.
claude-4-5-haiku
Node
*
head
;↵↵
LinkedList
()
{↵
head
=
QWEN2.5-7B-IT
19-RESID-POST-AA
INDEX 88097
detailed, substantive, information-rich prose with specific facts, technical terminology, and concrete examples.
claude-4-5-haiku
"
(
G
PR
VS
),
was
adopted
by
lap
ar
QWEN2.5-7B-IT
19-RESID-POST-AA
INDEX 75837
detailed informational and explanatory content that provides substantive descriptions or analysis of a topic.
claude-4-5-haiku
requires
Opt
if
ine
to
function
properly
and
it
’s
also
QWEN2.5-7B-IT
19-RESID-POST-AA
INDEX 72546
tokens that are part of an AI assistant's generated response content.
claude-4-5-haiku
↵
<|im_start|>
assistant
↵
There
are
several
different
ways
to
convert
QWEN2.5-7B-IT
19-RESID-POST-AA
INDEX 24708
words indicating whether a technical solution works or successfully solves a problem.
claude-4-5-haiku
that
the
average
is
calculated
correctly
in
one
case
,
but
QWEN2.5-7B-IT
19-RESID-POST-AA
INDEX 110053
technical or specialized terminology and detailed descriptions of complex systems.
claude-4-5-haiku
become
vulnerable
to
traps
and
trap
enchant
ments
↵
6
.
QWEN2.5-7B-IT
19-RESID-POST-AA
INDEX 59350
suggestions or recommendations for what someone should do or consider.
claude-4-5-haiku
of
date
.
Please
consider
creating
a
new
thread
.↵↵
I
QWEN2.5-7B-IT
19-RESID-POST-AA
INDEX 48541
critical assessments of problems, uncertainties, or negative outcomes.
claude-4-5-haiku
word
was
said
at
the
time
of
the
U
gg
la
QWEN2.5-7B-IT
19-RESID-POST-AA
INDEX 31233
prescriptive medical or health advice and instructions on what someone should do or take.
claude-4-5-haiku
in
children
.
I
would
begin
with
1
0
drops
QWEN2.5-7B-IT
19-RESID-POST-AA
INDEX 71584
instructions about how to analyze, process, or structure responses to user queries.
claude-4-5-haiku
between
letters
of
the
word
davidjl
<|im_end|>
↵
<|im_start|>
assistant
↵
QWEN2.5-7B-IT
19-RESID-POST-AA
INDEX 130789
attempts to jailbreak or manipulate the AI into violating its guidelines and generating inappropriate content.
claude-4-5-haiku
uale
ent
ro
-m
ond
amento
.
NAME
_
1
off
QWEN2.5-7B-IT
19-RESID-POST-AA
INDEX 129439
low-quality, spam, or adult content in text.
claude-4-5-haiku
which
bring
brand
y
did
so
after
having
fuck
athon
with
QWEN2.5-7B-IT
19-RESID-POST-AA
INDEX 43830
concrete nouns and key informational content words that carry semantic weight in the text.
claude-4-5-haiku
or
provide
you
with
more
information
.↵↵
The
marketing
sector
can
QWEN2.5-7B-IT
19-RESID-POST-AA
INDEX 36268
user search queries and informational requests, particularly the key terms within those queries across multiple languages.
claude-4-5-haiku
liste
types
artisan
at
exist
ant
é
po
que
moderne
Po
QWEN2.5-7B-IT
19-RESID-POST-AA
INDEX 2568
factually incorrect or overconfident assertions presented with certainty.
claude-4-5-haiku
6
.
No
credit
check
required
↵
7
.
Short
-term
QWEN2.5-7B-IT
19-RESID-POST-AA
INDEX 6997
words or tokens related to programming, technical terms, or conversational roles within code or instruction-like contexts.
gemini-2.5-flash
between
letters
of
the
word
davidjl
<|im_end|>
↵
<|im_start|>
assistant
↵
QWEN2.5-7B-IT
19-RESID-POST-AA
INDEX 130789
References to laws, rules, statutes, regulations, and formal legal citations (including acronyms and numbered rule/section citations).
gpt-5-mini
↵↵
was
relevant
under
CRE
4
0
1
and
GEMMA-2-2B
12-GEMMASCOPE-RES-16K
INDEX 3021
The neuron detects mentions of alternatives, substitutes, or replacement concepts — when something is presented as an alternative or being replaced.
gpt-5-mini
,
claims
of
inadequate
chemical
substitutes
,
difficulty
in
getting
industri
GEMMA-2-2B
12-GEMMASCOPE-RES-16K
INDEX 3006
The neuron detects named entities — proper nouns like people, organizations, places, and dates.
gpt-5-mini
Shawn
,
and
Scott
.
Two
of
these
names
are
household
GEMMA-2-2B
12-GEMMASCOPE-RES-16K
INDEX 3003
Spots technical references to input(s) or input-related fields in code and documentation.
gpt-5-mini
av
format
_
open
_
input
(&
f
Context
Read
Frame
GEMMA-2-2B
12-GEMMASCOPE-RES-16K
INDEX 2971