© Neuronpedia 2026

Neuronpedia

GPT2-Small · Transcoders Residuals · 8-TRES-DC · Index 41

Explanations

expressions of strong emotional states

oai_token-act-pair · gpt-4o-mini Triggered by @bot

Negative Logits

"avior"         -0.72
"士"            -0.71
"}}}"           -0.66
"oldown"        -0.66
"ドラ"          -0.65
"Export"        -0.64
"aviour"        -0.63
"��"            -0.63
" overshadow"   -0.63
"魔"            -0.62

Positive Logits

" thankful"    0.99
" grateful"    0.98
" glad"        0.96
" impressed"   0.93
" thrilled"    0.93
" pleased"     0.93
" saddened"    0.90
" delighted"   0.90
" sorry"       0.89
" proud"       0.89

Activations Density 0.091%

No Known Activations