Neuronpedia

Explanations

text

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-gpt-oss-20b/resid_post_layer_11/trainer_0

Dataset (Dashboard)

Various

Negative Logits

" owning"  -0.09
" полный"  -0.08
"sons"  -0.08
" sons"  -0.08
" buitenlandse"  -0.08
"versicherung"  -0.08
"vaz"  -0.07
" wasted"  -0.07
" unnecessary"  -0.07
"基金"  -0.07

Positive Logits

"字幕"  0.12
"Throughout"  0.11
" Throughout"  0.10
" स्क्रीन"  0.10
" throughout"  0.10
" появляется"  0.09
" serif"  0.09
" लोगो"  0.09
" subtitles"  0.09
" captions"  0.09

Activations Density 0.009%

No Known Activations

© Neuronpedia 2025
