Neuronpedia

Explanations

math symbols and formulas

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-gpt-oss-20b/resid_post_layer_11/trainer_0

Dataset (Dashboard)

Various

Negative Logits

Token          Logit
"erval"        -0.08
"анк"          -0.08
" pato"        -0.07
"pecified"     -0.07
"VER"          -0.07
" timid"       -0.07
"quisition"    -0.07
" sluggish"    -0.07
"IAN"          -0.07
"IVAL"         -0.07

Positive Logits

Token               Logit
" normalization"    0.08
" còn"              0.08
" оста"             0.08
"Normalization"     0.07
" overpower"        0.07
" existe"           0.07
" normalize"        0.07
" kosmet"           0.07
" Σε"               0.07
" आगे"              0.07

Activations Density 0.016%

No Known Activations

© Neuronpedia 2025