Neuronpedia

Explanations

"hurley and"

np_acts-logits-general · gemini-2.5-flash-lite

Configuration

google/gemma-scope-2-4b-it/transcoder_all/layer_11_width_262k_l0_small_affine

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

xavier    0.61
 اه    0.50
avik    0.49
荡    0.49
iania    0.48
illiard    0.47
றிய    0.46
шко    0.46
ز    0.46
IGENCE    0.45

Positive Logits

 distintas    0.72
 oppression    0.69
 pathogenesis    0.65
 repressive    0.64
matmul    0.63
 patriarchal    0.63
 भागों    0.62
 deportivas    0.62
 oppressive    0.62
 asesinato    0.62

Activations Density 0.016%

No Known Activations

© Neuronpedia 2025
