© Neuronpedia 2026

Andy Arditi · GPT-OSS BatchTopK SAEs
GPT-OSS-20B
Resid Post - 131k
11-RESID-POST-AA
Index: 112660

Explanations

Calculation questions

np_max-act · gemini-2.0-flash


Configuration

andyrdt/saes-gpt-oss-20b/resid_post_layer_11/trainer_0

Dataset (Dashboard)

Various

No Configuration Found


Negative Logits

Token  Logit
"irin"  -0.08
" ctor"  -0.07
"lette"  -0.07
"vald"  -0.07
">::"  -0.07
"iros"  -0.07
" mers"  -0.07
"ação"  -0.07
"拔"  -0.07
"atórias"  -0.07

Positive Logits

Token  Logit
" lágr"  0.09
" lacag"  0.08
" എണ്ണം"  0.08
" мае"  0.08
" severely"  0.08
" কাপ"  0.08
" সে"  0.08
"页"  0.08
" significantly"  0.08
" deren"  0.08

Activations Density 0.212%

No Known Activations