Neuronpedia


INDEX

Explanations

定位 ("locating" / "positioning")

np_max-act-logits · gemini-2.0-flash


Configuration

andyrdt/saes-gpt-oss-20b/resid_post_layer_15/trainer_0

Dataset (Dashboard)

Various

Negative Logits

"بارات"  -0.09
" إلا"  -0.09
"èy"  -0.09
" strategically"  -0.08
" éagsúla"  -0.08
"asjoner"  -0.08
" किंवा"  -0.08
"ثال"  -0.08
" მაგალითად"  -0.08
".completed"  -0.08

Positive Logits

"是哪"  0.10
" culp"  0.10
"定位"  0.10
" blame"  0.10
" دقیق"  0.09
"所属"  0.09
"究"  0.08
" needle"  0.08
" locating"  0.08
" 정확"  0.08

Activations Density 0.006%

No Known Activations
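
As a rough illustration of what a density of 0.006% means, the sketch below assumes that density is the fraction of token positions on which the feature has a positive activation. That definition is an assumption; the page itself does not state it. The function name `activation_density` and the sample data are hypothetical and are not taken from Neuronpedia.

```python
def activation_density(activations):
    """Return the fraction of token positions where the feature fires.

    Assumption: a position counts as "active" when its activation is > 0.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > 0)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 6 of which activate the feature.
acts = [0.0] * 100_000
for i in range(6):
    acts[i] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.006%
```

Under this reading, a density of 0.006% corresponds to roughly 6 active tokens per 100,000, which is consistent with a very sparse feature.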

© Neuronpedia 2025