© Neuronpedia 2026

Neuronpedia

GPT2-Small · Transcoders Residuals · 8-TRES-DC · Index 633

Explanations

references to things that are unusual or unconventional

oai_token-act-pair · gpt-4o-mini Triggered by @bot

Negative Logits

Token          Logit
" Against"     -0.66
" Workshop"    -0.63
" Painting"    -0.63
" tailor"      -0.62
"Close"        -0.62
"acist"        -0.61
"orah"         -0.59
" towed"       -0.59
"andan"        -0.58
" Another"     -0.58

Positive Logits

Token          Logit
"ball"         0.72
"iversary"     0.69
"bj"           0.68
"eah"          0.68
"balls"        0.66
"amount"       0.65
" omission"    0.64
"ities"        0.63
" acron"       0.63
" distribut"   0.62

Activation Density: 0.120%

No Known Activations