Neuronpedia

INDEX

Explanations

is

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-gpt-oss-20b/resid_post_layer_11/trainer_0

Dataset (Dashboard)

Various

Negative Logits

" നടക്കുന്ന"  -0.08
" ನಡೆಸ"  -0.08
"优势"  -0.08
"actics"  -0.08
" നടത്തുന്ന"  -0.07
"有哪些"  -0.07
" chlor"  -0.07
" toot"  -0.07
"投入"  -0.07
"发展的"  -0.07

Positive Logits

"æt"  0.09
"לים"  0.08
"boxed"  0.07
" Fondation"  0.07
"oui"  0.07
"forth"  0.07
"_ans"  0.07
"YES"  0.07
".Ans"  0.07
"_register"  0.07

Activations Density: 0.028%

No Known Activations

© Neuronpedia 2025
