Neuronpedia

Model: Gemma-2-27B
Source: 22-GEMMASCOPE-RES-131K
Index: 68441

Explanations

self discovery

np_acts-logits-general · gemini-2.5-flash-lite

Configuration

google/gemma-scope-27b-pt-res/layer_22/width_131k
Prompts (Dashboard): 24,576 prompts, 128 tokens each
Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token         Logit
"at"          -1.94
"in"          -1.73
"all"         -1.70
" still"      -1.58
" these"      -1.55
" because"    -1.52
" every"      -1.48
"by"          -1.48
"as"          -1.48
" after"      -1.47

Positive Logits

Token         Logit
"為に"        1.66
"ほら"        1.62
"ところに"    1.50
"割と"        1.45
"ようで"      1.42
"あー"        1.41
"のエ"        1.38
"感じで"      1.38
"すっ"        1.38
"こいつ"      1.36

Activations Density 1.600%

No Known Activations