Neuronpedia

Explanations

calculations and verification

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-gpt-oss-20b/resid_post_layer_11/trainer_0

Dataset (Dashboard)

Various

Negative Logits

"(||"                 -0.08
"-transform"          -0.07
"akin"                -0.07
" essentiellement"    -0.07
" xây"                -0.07
"_build"              -0.07
" akin"               -0.07
"*:"                  -0.07
"τερ"                 -0.07
"(Add"                -0.07

Positive Logits

" nochmals"           0.12
" novamente"          0.11
"再次"                0.10
" weiterhin"          0.10
" jeszcze"            0.10
" erneut"             0.10
" שוב"                0.10
" nuevamente"         0.09
" nochmal"            0.09
" volled"             0.09

Activations Density 0.076%

No Known Activations

© Neuronpedia 2025
