© Neuronpedia 2026

Neuronpedia

Andy Arditi · GPT-OSS BatchTopK SAEs
GPT-OSS-20B
Resid Post - 131k
23-RESID-POST-AA
49245

Explanations

say soda

np_max-act-logits · gemini-2.0-flash

Configuration

andyrdt/saes-gpt-oss-20b/resid_post_layer_23/trainer_0

Dataset (Dashboard)

Various

Negative Logits

Token          Logit
" drinks"      -0.09
" Drinks"      -0.09
" beverages"   -0.08
" ontbij"      -0.08
"dr"           -0.08
"Dr"           -0.08
" sauce"       -0.08
"onado"        -0.08
"ouns"         -0.08
" Nación"      -0.08

Positive Logits

Token            Logit
" sparkling"     0.15
" club"          0.15
" Club"          0.14
"Club"           0.14
" क्लब"           0.13
"club"           0.13
" hielo"         0.12
" mynta"         0.12
" mudd"          0.12
" carbonation"   0.12

Activations Density 0.022%

No Known Activations