© Neuronpedia 2026

Neuronpedia

GPT2-Small · Transcoders Residuals · 8-TRES-DC · Index 165

Explanations

references to tangible goods or resources

oai_token-act-pair · gpt-4o-mini Triggered by @bot

Negative Logits

xon        -0.90
entin      -0.77
insky      -0.70
rams       -0.70
nces       -0.69
acket      -0.68
alach      -0.68
instein    -0.67
bats       -0.66
igious     -0.66

Positive Logits

istic        0.78
ize          0.68
istically    0.68
ocent        0.68
ista         0.65
izable       0.65
opolis       0.63
istas        0.63
opsy         0.61
ise          0.61

Activations Density 0.016%

No Known Activations