Neuronpedia

© Neuronpedia 2025

EXPLANATION TYPE

np_token-act-pair-logits

Description

OpenAI's Automated Interpretability method, from the paper "Language models can explain neurons in language models". Modified by Johnny Lin to support new models and context windows. Further modifications in May 2025: the explainer model is shown the top positive logits and asked to be more concise, omitting filler such as "phrases related to...".

Author

OpenAI

URL

https://github.com/hijohnnylin/automated-interpretability

Settings

Modified version of OpenAI's token-activation-pair method. Modifications: the explainer model is shown the top positive logits and asked to be more concise, omitting filler such as "phrases related to...".
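
As a rough illustration of the settings above, the sketch below shows how token-activation pairs and top positive logits might be combined into one explainer prompt. It is not the code from the linked repository: the function names, prompt wording, and sample activation values are hypothetical, and the floor-based 0 to 10 scaling is an assumption modeled on the discretization described in the OpenAI paper.

```python
import math
from typing import List, Sequence


def normalize_activations(activations: Sequence[float], max_activation: float) -> List[int]:
    """Map raw activations onto an integer 0-10 scale (assumed scheme).

    Negative values are clamped to 0; the maximum activation maps to 10.
    """
    if max_activation <= 0:
        return [0 for _ in activations]
    return [
        min(10, max(0, math.floor(10 * a / max_activation)))
        for a in activations
    ]


def build_prompt(tokens: Sequence[str],
                 activations: Sequence[float],
                 top_logits: Sequence[str]) -> str:
    """Assemble a hypothetical explainer prompt.

    Each token is listed with its scaled activation (tab-separated),
    followed by the top positive logits and a request for a concise answer.
    """
    scaled = normalize_activations(activations, max(activations, default=0.0))
    pair_lines = [f"{tok}\t{act}" for tok, act in zip(tokens, scaled)]
    return "\n".join([
        "Token-activation pairs:",
        *pair_lines,
        "Top positive logits: " + ", ".join(top_logits),
        'Explain concisely what this feature responds to. '
        'Omit filler such as "phrases related to".',
    ])


if __name__ == "__main__":
    # Tokens taken from one of the listed snippets; activations and
    # logits are made-up illustrative values, not real feature data.
    tokens = [" technology", ".", " Although", " the"]
    activations = [0.0, 0.5, 4.0, 0.25]
    print(build_prompt(tokens, activations, [" though", " however"]))
```

Showing the logits next to the activations gives the explainer model a hint about what the feature promotes in the output, not only which input tokens trigger it.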

Recent Explanations

this neuron activates for languages other than English

gemini-2.0-flash

in the world, and drops Reenie at home,

16-GEMMASCOPE-TRANSCODER-16K

Although

gemini-2.0-flash

in science and technology. Although the specific effects of sequestration

16-GEMMASCOPE-TRANSCODER-16K

Mathematical notation

gemini-2.0-flash

0 = |u\rangle |0\rangle \

16-GEMMASCOPE-TRANSCODER-16K

short code snippets

gemini-2.0-flash

* Converts the type and the subtype of the parsed media

16-GEMMASCOPE-TRANSCODER-16K

network

gemini-2.0-flash

ierkiewicz v. Sorema↵N. A.,

16-GEMMASCOPE-TRANSCODER-16K

data and anonymity

gemini-2.0-flash

quantitatively the same results (data are not shown).↵↵

16-GEMMASCOPE-TRANSCODER-16K

dates and numerical ranges

gemini-2.0-flash

4 months from June to October 2018

16-GEMMASCOPE-TRANSCODER-16K

Legal citations

gemini-2.0-flash

Kan. 77, 79,

16-GEMMASCOPE-TRANSCODER-16K

capitalized words in sports articles

gemini-2.0-flash

Cureton sparked a second half comeback to pull out a

16-GEMMASCOPE-TRANSCODER-16K

code

gemini-2.0-flash

scientists, this is first discovery in Southeast Asiaon Pla

16-GEMMASCOPE-TRANSCODER-16K

gibberish/randomness

gemini-2.0-flash

can't." A rumbling, huge memory from the

16-GEMMASCOPE-TRANSCODER-16K

code

gemini-2.0-flash

-----------------\n" <<↵ "Array Size (

16-GEMMASCOPE-TRANSCODER-16K

technical/mechanical descriptions

gemini-2.0-flash

rod handle in positive angular engagement with each other about a

16-GEMMASCOPE-TRANSCODER-16K

seemingly random tokens in running text

gemini-2.0-flash

Turkmenistan national team beat the tournament hosts Nepal (0–

16-GEMMASCOPE-TRANSCODER-16K

our/us and nearby words

gemini-2.0-flash

↵hassles↵↵Visit our Services Page to see a

16-GEMMASCOPE-TRANSCODER-16K

seemingly random text and code snippets

gemini-2.0-flash

White supervisor out of the equation, especially, and next

16-GEMMASCOPE-TRANSCODER-16K

words present in questions and video game reviews

gemini-2.0-flash

of horror for me. Horror is about the creeping shiver

16-GEMMASCOPE-TRANSCODER-16K

Unclear, but it may be a period followed by one or more common words or characters used in HTML

gemini-2.0-flash

child-friendly events.↵↵We are still in the

16-GEMMASCOPE-TRANSCODER-16K

unclear

gemini-2.0-flash

machine all the time. “They help each other out

16-GEMMASCOPE-TRANSCODER-16K

the letter [any capital letter]

gemini-2.0-flash

the sample size and the letter indicates whether the sample had

16-GEMMASCOPE-TRANSCODER-16K