© Neuronpedia 2026

Neuronpedia

Model: GPT2-Small
Source: 8-TRES-DC (Transcoders Residuals)
Index: 192

Explanations

repeated usage of the verb "do" in various contexts

oai_token-act-pair · gpt-4o-mini Triggered by @bot

Negative Logits

" appet"        -0.68
" dynam"        -0.67
" stead"        -0.64
" replication"  -0.64
" stimul"       -0.64
" demolition"   -0.60
" sleeper"      -0.60
" tranquil"     -0.59
"syn"           -0.59
" reson"        -0.58

Positive Logits

" Malf"  0.83
"ammy"   0.81
"ffe"    0.72
"orf"    0.72
"dc"     0.69
"pose"   0.68
"FFER"   0.67
"rant"   0.67
"ented"  0.66
"ensed"  0.65

Activations Density 0.054%

No Known Activations