Neuronpedia

INDEX

Explanations

1

np_max-act-logits · gemini-2.0-flash

Configuration

andyrdt/saes-gpt-oss-20b/resid_post_layer_15/trainer_0

Dataset (Dashboard)

Various

Negative Logits

ETA	-0.08
Welcome	-0.08
Cue	-0.08
 Welcome	-0.08
 italiana	-0.08
ETA	-0.08
.but	-0.08
Peb	-0.07
idata	-0.07
WELCOME	-0.07

Positive Logits

 فول	0.07
)"↵	0.07
 специальных	0.07
+m	0.07
خصوص	0.07
 перспектив	0.07
 உர	0.07
-fold	0.07
ক্ষম	0.07
 месяца	0.07

Activations Density 0.003%

No Known Activations

© Neuronpedia 2025
