Neuronpedia

Explanations

the

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-gpt-oss-20b/resid_post_layer_11/trainer_0

Dataset (Dashboard)

Various

Negative Logits

Token            Logit
" casual"        -0.08
"ọ"              -0.07
"HOM"            -0.07
" несов"         -0.07
"Chrom"          -0.07
" चोट"           -0.07
"_spec"          -0.07
" घंट"           -0.07
" división"      -0.07
" circonst"      -0.07

Positive Logits

Token            Logit
"成功"           0.15
" exitos"        0.15
" exemplary"     0.15
" Successful"    0.14
"Successful"     0.14
" সফল"           0.14
" successful"    0.14
" ಯಶ"            0.14
" విజయ"          0.14
"successful"     0.14

Activations Density 0.173%

No Known Activations

© Neuronpedia 2025
