Neuronpedia


Explanations

gelatin or echo

np_acts-logits-general · gemini-2.5-flash-lite


Configuration: google/gemma-scope-2-27b-it/resid_post/layer_31_width_262k_l0_medium

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1


Negative Logits

Token               Value
" majest"           0.44
" vorbere"          0.44
"rollen"            0.43
"큽"                0.42
" berg"             0.41
(not rendered)      0.40
"旮"                0.40
" efficacement"     0.40
" distortions"      0.39
" understandings"   0.39

Positive Logits

Token               Value
"CATEGORY"          0.51
" સારી"             0.49
" ম্যানেজার"         0.48
" सामान्य"           0.48
"superclass"        0.47
" સાર"              0.46
" trabajador"       0.46
" районе"           0.46
" 时尚"             0.46
"ٻ"                 0.44

Activations Density 0.001%

No Known Activations
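
For a sense of scale, the density figure can be turned into a rough token count. The sketch below assumes, without confirmation from this page, that "Activations Density" means the fraction of dashboard tokens on which the feature fires, and that it was measured over the same 238,145 prompts of 512 tokens each. The displayed 0.001% is presumably rounded, so the result is an order-of-magnitude estimate only. The function name `expected_active_tokens` is made up for illustration.

```python
# Figures copied from the dashboard; the meaning of "density" is an assumption.
NUM_PROMPTS = 238_145
TOKENS_PER_PROMPT = 512
DENSITY_PERCENT = 0.001  # as displayed, presumably rounded


def expected_active_tokens(num_prompts: int, tokens_per_prompt: int,
                           density_percent: float) -> float:
    """Estimate how many tokens activate the feature, assuming density is
    the percentage of all dashboard tokens with a nonzero activation."""
    total_tokens = num_prompts * tokens_per_prompt
    return total_tokens * density_percent / 100.0


total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT
estimate = expected_active_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT, DENSITY_PERCENT)

print(f"total dashboard tokens: {total_tokens:,}")
print(f"estimated active tokens: about {estimate:,.0f}")
```

Under these assumptions the feature would fire on roughly a thousand tokens out of more than a hundred million. The page nevertheless lists no known activations.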

© Neuronpedia 2025
