Neuronpedia


INDEX

Explanations

good for

np_acts-logits-general · gemini-2.5-flash-lite


Configuration

google/gemma-scope-2-12b-it/resid_post/layer_12_width_16k_l0_medium

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1



Negative Logits

ü  1.54
を実現  1.43
。  1.39
৫৫  1.33
蚓  1.32
ce  1.29
ś  1.29
を楽し  1.28
০১  1.28
aj  1.27

Positive Logits

ن  2.06
و  1.53
ா  1.47
ात  1.37
ή  1.30
ні  1.29
ные  1.26
ாப்  1.25
 tamaño  1.24
ture  1.21

Activations Density 0.169%

No Known Activations

© Neuronpedia 2026
