Neuronpedia



Explanations

но (Russian: "but")

np_max-act · gemini-2.0-flash


Configuration

andyrdt/saes-gpt-oss-20b/resid_post_layer_11/trainer_0

Dataset (Dashboard)

Various


Negative Logits

可靠    -0.08
 plumbing    -0.08
éin    -0.08
 renewable    -0.08
伙伴    -0.07
xref    -0.07
 samm    -0.07
 betrouwbare    -0.07
PSL    -0.07
 prü    -0.07

Positive Logits

 captivated    0.10
 unforgettable    0.09
 allure    0.08
 subconscious    0.08
想着    0.08
 impeccable    0.08
 الصغيرة    0.08
 unut    0.08
 greeted    0.08
 decreased    0.08

Activations Density 0.004%

No Known Activations

© Neuronpedia 2025
