Neuronpedia

INDEX

Explanations

given

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-gpt-oss-20b/resid_post_layer_11/trainer_0

Dataset (Dashboard)

Various

Negative Logits

"://'"  -0.09
"cobra"  -0.08
" estimates"  -0.08
" realizing"  -0.08
" realizes"  -0.08
" realization"  -0.08
" steep"  -0.07
" настоя"  -0.07
"jg"  -0.07
" realized"  -0.07

Positive Logits

"DSL"  0.09
"ודית"  0.09
"ിരിക്കുന്ന"  0.07
" conversational"  0.07
" בער"  0.07
" Binnen"  0.07
" ideological"  0.07
" Ausstattung"  0.07
"wię"  0.07
"ורג"  0.07

Activations Density 0.008%

No Known Activations

© Neuronpedia 2025
