© Neuronpedia 2026

Neuronpedia

GPT2-Small · Transcoders Residuals · 8-TRES-DC · Index 333

Explanations

the term "great" in various contexts

oai_token-act-pair · gpt-4o-mini Triggered by @bot


Negative Logits

Token       Logit
¯¯¯¯¯¯¯¯    -0.66
RAG         -0.63
カ          -0.62
�           -0.60
urat        -0.60
chenko      -0.59
Cub         -0.58
sure        -0.58
aterial     -0.57
Kry         -0.57

Positive Logits

Token       Logit
anwhile     0.89
theless     0.73
abouts      0.68
icides      0.65
stores      0.65
arth        0.63
drivers     0.63
street      0.62
venient     0.61
erity       0.61

Activations Density 0.069%

No Known Activations