Neuronpedia

Explanations

names and places

np_acts-logits-general · gemini-2.5-flash-lite

Configuration

google/gemma-scope-2-27b-it/resid_post/layer_16_width_262k_l0_medium

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

" più"  0.61
" tärke"  0.61
" beeinfl"  0.56
" mitä"  0.54
" поддержка"  0.54
" как"  0.54
"ಚ್"  0.54
"Ｒ"  0.54
" seperti"  0.53
" öst"  0.53

Positive Logits

" robbers"  0.62
" majest"  0.53
" handsome"  0.48
" robberies"  0.48
" bandits"  0.47
" carelessly"  0.47
" waiters"  0.46
" cheques"  0.46
" arrog"  0.46
" mutton"  0.46

Activations Density 0.001%

No Known Activations

© Neuronpedia 2026
