Neuronpedia

INDEX

Explanations

ly

np_max-act · gemini-2.0-flash

Configuration: andyrdt/saes-gpt-oss-20b/resid_post_layer_11/trainer_0

Dataset (Dashboard): Various

Negative Logits

Token        Logit
"bg"         -0.08
" elas"      -0.08
"=array"     -0.08
"Bart"       -0.08
" Bakers"    -0.08
"erl"        -0.08
"nek"        -0.08
"mdl"        -0.07
" turmoil"   -0.07
"loch"       -0.07

Positive Logits

Token              Logit
" рекоменду"       0.09
" outweigh"        0.09
" recommend"       0.09
"认为"             0.09
" рекомендуется"   0.08
" encouraged"      0.08
" urge"            0.08
" believe"         0.08
" urged"           0.08
" recomand"        0.08
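
For readers unfamiliar with these lists, here is a minimal sketch of one common way such token rankings are produced: the feature's decoder direction is projected through the model's unembedding matrix, and the tokens with the largest and smallest resulting logits are reported. This is an assumption about the general technique, not a description of Neuronpedia's exact pipeline. The names `top_logit_tokens`, `decoder_dir`, `unembed`, and `vocab` are hypothetical, and the data below is random toy data, not the real SAE or model weights.

```python
import numpy as np

def top_logit_tokens(decoder_dir, unembed, vocab, k=10):
    """Rank tokens by the logit a feature direction adds (hypothetical helper).

    decoder_dir: (d_model,) feature direction in the residual stream
    unembed:     (d_model, vocab_size) unembedding matrix
    vocab:       list of token strings, length vocab_size
    """
    logits = decoder_dir @ unembed            # shape (vocab_size,)
    order = np.argsort(logits)                # ascending by logit
    negative = [(vocab[i], float(logits[i])) for i in order[:k]]
    positive = [(vocab[i], float(logits[i])) for i in order[::-1][:k]]
    return positive, negative

# Toy data only: random weights stand in for the real model.
rng = np.random.default_rng(0)
d_model, vocab_size = 8, 20
vocab = [f"tok{i}" for i in range(vocab_size)]
unembed = rng.normal(size=(d_model, vocab_size))
decoder_dir = rng.normal(size=d_model)

positive, negative = top_logit_tokens(decoder_dir, unembed, vocab, k=3)
for token, value in positive:
    print(f"{token!r:10} {value:+.2f}")
```

Under this reading, small magnitudes like the ±0.08 to ±0.09 values above would mean the feature only weakly shifts the output distribution directly.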

Activations Density 0.019%
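
As a rough illustration of the density figure, the sketch below assumes that activation density means the fraction of sampled tokens on which the feature has a nonzero activation. That definition is an assumption here, and `activation_density` is a hypothetical helper run on made-up data.

```python
import numpy as np

def activation_density(acts):
    """Fraction of token positions where the feature fires (activation > 0)."""
    acts = np.asarray(acts, dtype=float)
    return float(np.count_nonzero(acts > 0) / acts.size)

# Made-up activations: the feature fires on 2 of 10,000 tokens.
acts = np.zeros(10_000)
acts[:2] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")
```

On that definition, a density of 0.019% corresponds to the feature firing on roughly 19 out of every 100,000 tokens.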

No Known Activations

© Neuronpedia 2025
