Neuronpedia

Explanations

place names and places

np_acts-logits-general · gemini-2.5-flash-lite

Configuration

google/gemma-scope-2-27b-it/resid_post/layer_31_width_262k_l0_medium

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

Token  Value
"debris"  0.38
"Slf"  0.36
"тков"  0.36
"Tri"  0.35
"っくり"  0.35
" ஒப்பந்த"  0.35
"っぷ"  0.34
"啸"  0.33
"েন্ট"  0.33
" вс"  0.33

Positive Logits

Token  Value
" seria"  0.38
"vasena"  0.37
"ﾞ"  0.37
" llega"  0.37
" 서로"  0.36
"wany"  0.36
" Prefecture"  0.35
" 여기가"  0.34
" egiten"  0.34
" প্রয"  0.34

Activations Density: 0.032%

No Known Activations

© Neuronpedia 2025