Neuronpedia

Explanations

art and relating to others

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-gpt-oss-20b/resid_post_layer_11/trainer_0

Dataset (Dashboard)

Various

Negative Logits

Token              Logit
"ritic"            -0.08
"engen"            -0.08
"MOM"              -0.08
"ಗೊಂಡ"             -0.08
"kele"             -0.08
" Insurance"       -0.08
"yos"              -0.07
" atol"            -0.07
" jaarlijkse"      -0.07
(token not shown)  -0.07

Positive Logits

Token                Logit
" audiences"         0.10
" audience"          0.10
" الآخرين"           0.10
" misunder"          0.10
" followers"         0.09
" ప్రేక్షక"          0.09
"大众"               0.09
" الناس"             0.09
" misunderstanding"  0.09
" someone"           0.09

Activations Density: 0.117%

No Known Activations

© Neuronpedia 2025
