© Neuronpedia 2026

Neuronpedia

GPT2-Small · Transcoders Residuals · 8-TRES-DC · Index 543

Explanations

expressions of desire or reluctance

oai_token-act-pair · gpt-4o-mini Triggered by @bot

Negative Logits

Token      Logit
"ahime"    -0.98
"eday"     -0.76
"acha"     -0.68
"neau"     -0.67
"iffe"     -0.65
"adden"    -0.64
"abin"     -0.63
"aban"     -0.62
"agne"     -0.62
"esar"     -0.61

Positive Logits

Token           Logit
" anybody"      0.75
" congr"        0.75
" anyone"       0.71
" unanswered"   0.69
" residents"    0.67
" unsupported"  0.67
" either"       0.66
"olding"        0.64
"rians"         0.64
" mortals"      0.62

Activations Density 0.048%

No Known Activations