© Neuronpedia 2026

Neuronpedia

GPT2-Small · Transcoders Residuals · 8-TRES-DC · Index 635

Explanations

references to U.S. states and their respective legislative or social actions

oai_token-act-pair · gpt-4o-mini Triggered by @bot

Negative Logits

Token           Logit
"pit"           -0.74
"arten"         -0.70
" Skies"        -0.70
"IFE"           -0.68
"undy"          -0.68
"izons"         -0.67
"raine"         -0.67
"arden"         -0.66
"unal"          -0.66
"awan"          -0.65

Positive Logits

Token           Logit
" encrypt"      0.74
" tacit"        0.72
" scrut"        0.72
" deft"         0.69
" quietly"      0.68
" detected"     0.67
" attributes"   0.67
"hiba"          0.67
" brisk"        0.67
" invented"     0.66

Activations Density: 0.237%

No Known Activations