Neuronpedia

INDEX

Explanations

historic locations

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-gpt-oss-20b/resid_post_layer_11/trainer_0

Dataset (Dashboard)

Various

Negative Logits

Token (quoted to show leading spaces)    Logit
"Helpful"       -0.09
" Helpful"      -0.08
"XOR"           -0.08
" معدل"         -0.08
"ICS"           -0.08
" iris"         -0.08
"OPT"           -0.08
" hilfreich"    -0.07
" helpful"      -0.07
" simpt"        -0.07

Positive Logits

Token (quoted to show leading spaces)    Logit
"历史"             0.19
" history"        0.16
"歷"               0.16
" ইতিহাস"         0.16
" geschiedenis"   0.16
" इतिहास"         0.16
" ചരിത്ര"          0.16
" historiques"    0.16
" historical"     0.16
" историю"        0.15

Activations Density 0.031%

No Known Activations

© Neuronpedia 2025
