Neuronpedia

INDEX

Explanations

implementing actions

np_acts-logits-general · gemini-2.5-flash-lite

Configuration

google/gemma-scope-2-4b-it/resid_post/layer_9_width_262k_l0_medium

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

Token  Logit
"☞"  2.51
"ਰ"  2.24
" overw"  2.24
"是在"  2.20
"দের"  2.19
" équip"  2.18
"식"  2.17
"лки"  2.14
"gaw"  2.13
"ThreadPool"  2.12

Positive Logits

Token  Logit
"ead"  2.23
"ൺ"  2.17
"is"  2.16
"عاد"  2.16
"ig"  2.11
"ем"  2.09
"غ"  2.08
"ль"  2.04
"ار"  2.01
"tad"  1.98

Activations Density 0.075%

No Known Activations

© Neuronpedia 2025
