Neuronpedia

Explanations

math, numbers

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-gpt-oss-20b/resid_post_layer_11/trainer_0

Dataset (Dashboard)

Various

Negative Logits

Token       Logit
"BLOCK"     -0.08
"stil"      -0.07
"ollo"      -0.07
"BIG"       -0.07
"‌ర"         -0.07
" típ"      -0.07
"Oma"       -0.07
" stitch"   -0.07
" primit"   -0.07
" matk"     -0.07

Positive Logits

Token          Logit
"本人"          0.10
" തന്നെ"        0.09
" 그대로"       0.09
" original"    0.08
" obviously"   0.08
"iginal"       0.08
" തന്ന"         0.08
"ిందే"          0.08
" itself"      0.08
"original"     0.08

Activations Density 0.125%

No Known Activations

© Neuronpedia 2025
