Neuronpedia

INDEX

Explanations

comment symbols

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-gpt-oss-20b/resid_post_layer_7/trainer_0

Dataset (Dashboard)

Various

Negative Logits

Token (quoted to show leading spaces), logit
" Institutions"  -0.08
" Institution"  -0.08
"Kis"  -0.08
" Stiftung"  -0.08
" institutions"  -0.08
" Mosque"  -0.07
"эс"  -0.07
"әй"  -0.07
" enzyme"  -0.07
"ाउंड"  -0.07

Positive Logits

Token (quoted to show leading spaces), logit
" برابر"  0.09
" yerine"  0.08
" poput"  0.08
"tru"  0.07
"ూ"  0.07
" gravit"  0.07
" whilst"  0.07
" ਦੀ"  0.07
" violates"  0.07
"vyr"  0.07

Activations Density 0.003%

No Known Activations

© Neuronpedia 2025
