Neuronpedia

Explanations

_handler

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-gpt-oss-20b/resid_post_layer_11/trainer_0

Dataset (Dashboard)

Various

Negative Logits

Token              Logit
" standardized"    -0.10
"_pic"             -0.09
"guy"              -0.08
" ekip"            -0.08
"pep"              -0.08
(blank)            -0.08
" presentation"    -0.08
"Presentation"     -0.07
" बेट"              -0.07
" iceberg"         -0.07

Positive Logits

Token              Logit
"(/^"              0.10
" Expertise"       0.09
" spezialisiert"   0.09
"/\."              0.08
" Trigger"         0.08
" experts"         0.08
" uzman"           0.08
" эксперт"         0.08
" gjelder"         0.08
" décl"            0.07

Activations Density 0.001%

No Known Activations

© Neuronpedia 2025
