Neuronpedia



Explanations

only

np_acts-logits-general · gemini-2.5-flash-lite


Configuration

google/gemma-scope-2-27b-it/resid_post/layer_31_width_262k_l0_medium

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1


Negative Logits

Token (quoted to show leading spaces)    Value
"Ό"    0.38
"ImgBoard"    0.37
"ากหลาย"    0.35
"ామని"    0.33
"许多"    0.32
"ğer"    0.32
"มีการ"    0.32
" délic"    0.31
"());"    0.31
" ayaa"    0.31

Positive Logits

Token (quoted to show leading spaces)    Value
" only"    1.13
" только"    1.09
"only"    1.05
"只限"    1.03
" فقط"    0.97
" Only"    0.96
" ONLY"    0.96
" tylko"    0.95
" uniquement"    0.94
" केवल"    0.91

Activations Density 0.428%
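
To give the density figure some scale, the short sketch below multiplies out the dashboard prompt set (238,145 prompts of 512 tokens each) and applies the 0.428% density to it. It assumes the density was measured over that same token set, which this page does not state, so the second number is a rough estimate only. The function names are illustrative and not part of any Neuronpedia API.

```python
# Figures quoted on the dashboard page.
NUM_PROMPTS = 238_145
TOKENS_PER_PROMPT = 512
DENSITY_PERCENT = 0.428


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens in the dashboard prompt set."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> float:
    """Rough count of tokens on which the feature fires.

    Assumption: the density percentage is computed over the same
    token set as the dashboard prompts. The page does not confirm this.
    """
    return total * density_percent / 100.0


if __name__ == "__main__":
    total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
    active = estimated_active_tokens(total, DENSITY_PERCENT)
    print(f"total tokens: {total:,}")
    print(f"estimated active tokens: {active:,.0f}")
```

Under that assumption the feature would fire on roughly half a million of about 122 million tokens.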

No Known Activations

© Neuronpedia 2026
