Neuronpedia

Explanations

engaging community

np_acts-logits-general · gemini-2.5-flash-lite

Configuration: google/gemma-scope-2-4b-it/resid_post/layer_22_width_16k_l0_medium

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1
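The configuration identifier above packs several settings into one path-like string. A minimal sketch of splitting it into parts follows. The function name `parse_sae_id`, the field names, and the reading of each segment (organization, release, hook point, layer, width, L0 setting) are assumptions inferred from the string itself, not something this page documents.

```python
import re

def parse_sae_id(sae_id: str) -> dict:
    """Split a path-like SAE identifier into labeled parts.

    Field names are hypothetical; they are guessed from the string layout.
    """
    org, release, hook, spec = sae_id.split("/")
    # Expect the last segment to look like: layer_<int>_width_<size>_l0_<setting>
    match = re.fullmatch(r"layer_(\d+)_width_([^_]+)_l0_(.+)", spec)
    if match is None:
        raise ValueError(f"unrecognized SAE spec: {spec!r}")
    return {
        "org": org,
        "release": release,
        "hook": hook,
        "layer": int(match.group(1)),
        "width": match.group(2),
        "l0": match.group(3),
    }

parsed = parse_sae_id(
    "google/gemma-scope-2-4b-it/resid_post/layer_22_width_16k_l0_medium"
)
print(parsed)
```

The width is kept as the string "16k" instead of being converted to a number, since the page does not say how the suffix should be interpreted.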

Negative Logits

"Tweet"  1.06
" commas"  1.00
"PRI"  0.97
"Cookie"  0.96
"दा"  0.94
"Ț"  0.93
"Domain"  0.93
" einzelnen"  0.93
"Genre"  0.91
"τες"  0.90

Positive Logits

"subunit"  1.12
" подготовки"  1.08
"verständ"  1.06
"inputStream"  1.06
" constructively"  1.04
"ውን"  1.03
"峹"  1.02
" persiapan"  1.02
"激发"  1.01
" passionately"  1.01

Activations Density 0.000%

No Known Activations

© Neuronpedia 2025
