© Neuronpedia 2026

Neuronpedia

GPT2-Small · Transcoders Residuals · 8-TRES-DC · Index 152

Explanations

questions or phrases that inquire about types or categories of things

oai_token-act-pair · gpt-4o-mini Triggered by @bot

Negative Logits

"ATIONS"     -0.69
"uble"       -0.67
"EF"         -0.65
"ELL"        -0.65
"Ess"        -0.61
" Drift"     -0.59
"Et"         -0.57
"itations"   -0.57
" THREE"     -0.57
" Prelude"   -0.55

Positive Logits

"of"         0.98
"of"         0.93
"luster"     0.80
"nesses"     0.72
"achu"       0.69
"icles"      0.68
" thereof"   0.67
"OF"         0.64
"Of"         0.64
"oft"        0.64

Activations Density 0.032%

No Known Activations