Neuronpedia

Explanations

poster (np_acts-logits-general · gemini-2.5-flash-lite)

Configuration

SAE: google/gemma-scope-27b-pt-res/layer_10/width_131k
Prompts (Dashboard): 24,576 prompts, 128 tokens each
Dataset (Dashboard): monology/pile-uncopyrighted
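The SAE identifier in the configuration appears to pack several fields into one slash-separated path. The sketch below splits it into parts, assuming the layout is organization, release, `layer_<n>`, `width_<label>`. The `SaeSource` class and `parse_sae_source` function are hypothetical names introduced only for this illustration and are not part of any Neuronpedia API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SaeSource:
    """Hypothetical container for the parts of an SAE identifier."""
    organization: str
    release: str
    layer: int
    width_label: str


def parse_sae_source(identifier: str) -> SaeSource:
    # Assumed layout: <organization>/<release>/layer_<n>/width_<label>
    parts = identifier.split("/")
    if len(parts) != 4:
        raise ValueError(f"expected 4 path segments, got {len(parts)}")
    organization, release, layer_part, width_part = parts
    if not layer_part.startswith("layer_"):
        raise ValueError(f"unexpected layer segment: {layer_part!r}")
    if not width_part.startswith("width_"):
        raise ValueError(f"unexpected width segment: {width_part!r}")
    return SaeSource(
        organization=organization,
        release=release,
        layer=int(layer_part[len("layer_"):]),
        # The width is kept as a label because "131k" is a nominal size.
        width_label=width_part[len("width_"):],
    )


source = parse_sae_source("google/gemma-scope-27b-pt-res/layer_10/width_131k")
print(source)
```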

Negative Logits

Token            Logit
"ázaro"          -2.25
"釆"             -2.25
"rália"          -2.16
"糅"             -2.14
"籐"             -2.13
"estão"          -2.11
" समीक्षाएं"      -2.06
"ugais"          -2.06
" infamous"      -2.05
" solidly"       -2.03

Positive Logits

Token               Logit
(no visible token)  3.70
(no visible token)  3.03
"an"                2.97
"︲"                2.77
"er"                2.77
(no visible token)  2.63
"龶"                2.45
"the"               2.42
"がんば"            2.41
"まさかの"          2.39
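Logit lists like the ones above are commonly produced by projecting a feature's decoder direction through the model's unembedding matrix and reading off the largest and smallest entries. The sketch below shows that general technique on random stand-in data. It assumes this is how the page's values were derived, which the page itself does not state, and the function names, shapes, and vocabulary are all hypothetical.

```python
import numpy as np


def logit_effects(decoder_direction: np.ndarray, unembedding: np.ndarray) -> np.ndarray:
    """Project a feature direction (d_model,) through W_U (d_model, vocab)."""
    return decoder_direction @ unembedding


def top_and_bottom_tokens(effects: np.ndarray, vocab: list[str], k: int = 10):
    """Return the k most positive and k most negative (token, logit) pairs."""
    order = np.argsort(effects)  # ascending by logit effect
    bottom = [(vocab[i], float(effects[i])) for i in order[:k]]
    top = [(vocab[i], float(effects[i])) for i in order[::-1][:k]]
    return top, bottom


# Random stand-ins for real model weights (assumption: toy sizes only).
rng = np.random.default_rng(0)
d_model, vocab_size = 16, 50
decoder_direction = rng.normal(size=d_model)
unembedding = rng.normal(size=(d_model, vocab_size))
vocab = [f"tok_{i}" for i in range(vocab_size)]

effects = logit_effects(decoder_direction, unembedding)
top, bottom = top_and_bottom_tokens(effects, vocab, k=10)

print("Positive logits:", top[:3])
print("Negative logits:", bottom[:3])
```

With real weights, `vocab` would be the tokenizer's vocabulary, so leading spaces in tokens such as " infamous" would carry through to the table.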

Activations Density 0.011%
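To put the density figure in context, the arithmetic below estimates how many token positions it would correspond to across the dashboard prompts. It assumes the density is the fraction of dashboard tokens on which the feature has a nonzero activation, which is an interpretation and not something the page confirms. The variable names are illustrative.

```python
# Figures taken from the page's configuration and density fields.
n_prompts = 24_576
tokens_per_prompt = 128
density_percent = 0.011

# Total token positions covered by the dashboard prompts.
total_tokens = n_prompts * tokens_per_prompt

# Assumption: density = active token positions / total token positions.
expected_active = (density_percent / 100) * total_tokens

print(f"total tokens: {total_tokens:,}")
print(f"approx. active positions: {expected_active:.0f}")
```

Because the displayed percentage is rounded to three decimals, the result is only a rough estimate.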

No Known Activations

© Neuronpedia 2026
