© Neuronpedia 2026

Neuronpedia

GPT2-Small · Transcoders Residuals · 8-TRES-DC · Index 95


Explanations

occurrences of the word "Advertisement" in the text

oai_token-act-pair · gpt-4o-mini Triggered by @bot

Negative Logits

Token        Logit
"ctr"        -0.89
"reason"     -0.67
"Ranked"     -0.67
"cript"      -0.64
"icate"      -0.63
"enser"      -0.63
"ctor"       -0.62
"ayer"       -0.61
"ensibly"    -0.60
"mpeg"       -0.59

Positive Logits

Token              Logit
" docking"         0.67
"DOI"              0.64
" environment"     0.64
" DRAGON"          0.60
"↵"                0.60
"elson"            0.59
"bed"              0.59
"<|endoftext|>"    0.58
"Adams"            0.58
"↵"                0.57

Activations Density 0.008%

No Known Activations