Neuronpedia

Explanations

code documentation or formatting

np_acts-logits-general · gemini-2.5-flash-lite

Configuration

google/gemma-scope-2-4b-pt/resid_post/layer_29_width_16k_l0_medium

Prompts (Dashboard)

392,802 prompts, 256 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token          Logit
ℝ              1.51
 Sincerely     1.47
 Oops          1.45
 因此          1.44
 ****",        1.42
 Remarkably    1.41
 ***",         1.39
[{\            1.39
 SmackDown     1.38
 Aquí          1.36

Positive Logits

Token          Logit
(not shown)    1.64
•              1.48
(not shown)    1.48
$\             1.43
https          1.42
“              1.40
http           1.35
(not shown)    1.34
(not shown)    1.32
«              1.30

Activations Density 0.052%

No Known Activations

© Neuronpedia 2025
