Neuronpedia


INDEX

Explanations

to

np_max-act · gemini-2.0-flash


Configuration

andyrdt/saes-gpt-oss-20b/resid_post_layer_11/trainer_0

Dataset (Dashboard)

Various


Negative Logits

Token          Logit
"Veuillez"     -0.07
"ాయి"          -0.07
"ાયક"          -0.07
"rę"           -0.07
".Please"      -0.07
"Donc"         -0.07
" ezért"       -0.07
"↵"            -0.07
" nouvelle"    -0.07
" thereby"     -0.07

Positive Logits

Token              Logit
" Lastly"          0.11
" тағы"            0.11
" naman"           0.11
" lastly"          0.10
" miscellaneous"   0.10
"another"          0.10
" ebenfalls"       0.09
"또"               0.09
" మరో"             0.09
"Lastly"           0.09
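
This page does not state how the per-token logit values are produced. A common approach for sparse autoencoder features, assumed here, is to project the feature's decoder direction through the model's unembedding matrix and list the tokens with the largest and smallest resulting scores. The sketch below illustrates that idea with toy numbers; `feature_logit_effects`, `top_and_bottom`, and all of the data are hypothetical and are not taken from Neuronpedia or from the SAE named above.

```python
def feature_logit_effects(decoder_direction, unembedding):
    """Project a feature direction onto every vocabulary token.

    decoder_direction: list of d_model floats (hypothetical SAE decoder row).
    unembedding: d_model rows, each a list of vocab_size floats.
    Returns one score per vocabulary token.
    """
    d_model = len(decoder_direction)
    vocab_size = len(unembedding[0])
    return [
        sum(decoder_direction[i] * unembedding[i][j] for i in range(d_model))
        for j in range(vocab_size)
    ]


def top_and_bottom(effects, vocab, k):
    """Return the k most positive and k most negative (token, score) pairs."""
    ranked = sorted(zip(vocab, effects), key=lambda pair: pair[1], reverse=True)
    # The bottom list is reversed so the most negative token comes first.
    return ranked[:k], ranked[-k:][::-1]


# Toy data (assumed for illustration only): d_model = 2, vocab_size = 3.
vocab = [" Lastly", " thereby", "another"]
direction = [1.0, -0.5]
unembed = [
    [0.2, -0.1, 0.0],
    [0.0, 0.1, 0.2],
]

effects = feature_logit_effects(direction, unembed)
top, bottom = top_and_bottom(effects, vocab, k=1)
print("most promoted:", top)
print("most suppressed:", bottom)
```

Under this assumption, a positive score means the feature pushes the model toward predicting that token and a negative score means it pushes away from it. Leading spaces in tokens are kept because " Lastly" and "Lastly" are distinct vocabulary entries.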

Activations Density 0.378%
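
The page does not define the density figure. Assuming it is the fraction of sampled tokens on which the feature has a nonzero activation, 0.378% would correspond to roughly 378 active tokens per 100,000. The sketch below shows that calculation on synthetic data; `activation_density` and the sample values are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    activations: list of per-token feature activations (floats).
    Returns 0.0 for an empty list.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic sample (assumed): 378 active tokens out of 100,000.
sample = [1.0] * 378 + [0.0] * (100_000 - 378)
density = activation_density(sample)
print(f"{density:.3%}")
```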

No Known Activations

© Neuronpedia 2026
