Neuronpedia



Explanations

using glue as adhesive

np_acts-logits-general · gemini-2.5-flash-lite


Configuration

google/gemma-scope-27b-pt-res/layer_22/width_131k

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token                      Logit
(token missing in source)  -2.64
" meeste"                  -2.64
"᎑"                        -2.56
"泺"                       -2.48
(token missing in source)  -2.45
"</i>"                     -2.42
" Это"                     -2.31
" egregious"               -2.30
" meisten"                 -2.30
" работа"                  -2.30

Positive Logits

Token          Logit
"鷥"           2.41
" serão"       2.27
" verlangen"   2.25
" marinho"     2.25
" gewi"        2.20
" gewor"       2.11
"眎"           2.06
"圌"           2.05
"胧"           2.05
" Pág"         2.05

Activations Density 0.029%
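
To give a sense of scale, the sketch below combines the density figure with the dashboard prompt configuration listed above (24,576 prompts of 128 tokens each). It assumes, without confirmation from this page, that the density is the fraction of those dashboard tokens on which the feature has a nonzero activation; the variable names are illustrative only.

```python
# Figures copied from the dashboard configuration on this page.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY = 0.029 / 100  # 0.029% expressed as a fraction

# Total number of tokens across all dashboard prompts.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Assumption: density = fraction of these tokens with a nonzero activation.
expected_active_tokens = total_tokens * ACTIVATION_DENSITY

print(f"total tokens: {total_tokens:,}")
print(f"expected active tokens: {expected_active_tokens:.0f}")
```

Under that assumption, the feature would fire on roughly 900 of about 3.1 million tokens, which is consistent with a very sparse feature.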

No Known Activations

© Neuronpedia 2025
