Neuronpedia

Explanations

hyphens

np_max-act-logits · gemini-2.5-flash-lite

Configuration

google/gemma-scope-2-12b-pt/resid_post/layer_41_width_65k_l0_medium
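The identifier above appears to pack several settings into one slash-separated string. A minimal sketch of how it could be split into parts follows. The helper `parse_source_id` and the field names (`organization`, `release`, `hook`, `layer`, `width`, `l0`) are hypothetical labels chosen for illustration, not names taken from Neuronpedia or Gemma Scope, and the reading of each segment is an assumption based only on how the string looks.

```python
import re

# Identifier copied verbatim from the Configuration entry above.
SOURCE_ID = "google/gemma-scope-2-12b-pt/resid_post/layer_41_width_65k_l0_medium"


def parse_source_id(source_id: str) -> dict:
    """Split the identifier into labelled parts.

    Hypothetical helper: the field names are assumptions made for
    illustration, not an official schema.
    """
    organization, release, hook, variant = source_id.split("/")
    match = re.fullmatch(r"layer_(\d+)_width_(\w+?)_l0_(\w+)", variant)
    if match is None:
        raise ValueError(f"unexpected variant format: {variant!r}")
    layer, width, l0 = match.groups()
    return {
        "organization": organization,
        "release": release,
        "hook": hook,
        "layer": int(layer),
        "width": width,
        "l0": l0,
    }


parts = parse_source_id(SOURCE_ID)
print(parts)
```

Keeping `width` and `l0` as strings avoids guessing what `65k` and `medium` mean numerically.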

Prompts (Dashboard)

392,802 prompts, 256 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

쿱    0.51
मेरा    0.50
तीर्ण    0.49
民間    0.48
 দুজন    0.47
ўся    0.46
িশালী    0.46
 ಒಳಗ    0.46
இதில்    0.45
Turkish    0.45

Positive Logits

[token missing]    1.75
-,    1.44
`-    1.39
-/    1.27
"-    1.26
'-    1.17
()-    1.15
“-    1.11
°-    1.11
$-    1.09

Activations Density 0.027%

No Known Activations
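To put the density figure in perspective, the short calculation below converts 0.027% into an approximate token count. It assumes that the density is measured over the dashboard prompts listed under Configuration (392,802 prompts of 256 tokens each); the page does not state this, and it lists no known activations, so the result is only what the density would imply under that assumption.

```python
# Figures copied from the page; using them together is an assumption.
PROMPTS = 392_802          # "392,802 prompts"
TOKENS_PER_PROMPT = 256    # "256 tokens each"
DENSITY_PERCENT = 0.027    # "Activations Density 0.027%"

# Total tokens across all dashboard prompts.
total_tokens = PROMPTS * TOKENS_PER_PROMPT

# Tokens on which the feature would fire if the density applies to this set.
expected_active = total_tokens * DENSITY_PERCENT / 100

print(f"{total_tokens:,} tokens in total")
print(f"roughly {expected_active:,.0f} tokens with a nonzero activation")
```

Under that assumption the feature would fire on roughly 27 thousand of about 100 million tokens, which fits a feature tied to a narrow pattern such as hyphen-led tokens.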

© Neuronpedia 2026