Neuronpedia

Explanations

that

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-gpt-oss-20b/resid_post_layer_11/trainer_0

Dataset (Dashboard)

Various

Negative Logits

Token  Logit
"Perez"  -0.08
[token missing]  -0.08
"-guid"  -0.08
" используется"  -0.08
" Jacobs"  -0.08
"ículo"  -0.08
"ország"  -0.07
" 대한민국"  -0.07
" ensin"  -0.07
"Horiz"  -0.07

Positive Logits

Token  Logit
" wanting"  0.12
" want"  0.11
"Want"  0.10
" Want"  0.10
" muốn"  0.10
" souhaitent"  0.10
" ønsker"  0.09
" ترغب"  0.09
" хотят"  0.09
" غواړي"  0.09

Activations Density 0.073%

No Known Activations
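
As a rough illustration of what a density of 0.073% implies, the sketch below converts the percentage into an expected number of activating tokens. It assumes that "Activations Density" means the fraction of tokens on which the feature has a nonzero activation. The function name and the one-million-token sample size are hypothetical, chosen only to make the arithmetic concrete.

```python
def expected_active_tokens(density_percent: float, n_tokens: int) -> float:
    """Expected number of tokens on which the feature fires.

    Assumes density_percent is the percentage of tokens with a
    nonzero activation (an assumption about the dashboard's metric).
    """
    return density_percent / 100.0 * n_tokens


# Density value taken from the dashboard above.
density_percent = 0.073

# Hypothetical sample size, used only for illustration.
n_tokens = 1_000_000

active = expected_active_tokens(density_percent, n_tokens)

# Average number of tokens per activation under the same assumption.
tokens_per_activation = 100.0 / density_percent

print(f"Expected active tokens per {n_tokens:,}: {active:.0f}")
print(f"Roughly one activation every {tokens_per_activation:.0f} tokens")
```

Under that reading, the feature is sparse: it would fire on about 730 tokens out of every million.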

© Neuronpedia 2025
