Neuronpedia

INDEX

Explanations

Mathematical computations

np_max-act-logits · gemini-2.0-flash

Configuration

andyrdt/saes-gpt-oss-20b/resid_post_layer_15/trainer_0

Dataset (Dashboard)

Various

Negative Logits

Token    Logit
 {↵    -0.08
render    -0.08
 koristiti    -0.07
NEW    -0.07
 rende    -0.07
':↵    -0.07
 bruker    -0.07
 {↵↵    -0.07
 CURRENT    -0.07
 render    -0.07

Positive Logits

Token    Logit
 apesar    0.09
 again    0.09
 있으    0.09
 despite    0.08
 bulun    0.08
こちら    0.08
 unavoidable    0.08
 awkward    0.08
 zwar    0.08
unh    0.08

Activations Density 0.016%

No Known Activations

© Neuronpedia 2025
