© Neuronpedia 2026

Neuronpedia

Model: Gemma-2-2B
Source: 0-CLT-HP
Index: 90768

Explanations

steps

np_max-act · gemini-2.0-flash

Configuration

Prompts (Dashboard)

16,384 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

No Configuration Found

Negative Logits

Token            Logit
" methods"       -1.55
" steps"         -1.49
"methods"        -1.44
"Methods"        -1.41
" Methods"       -1.38
" strategies"    -1.38
"steps"          -1.33
"METHODS"        -1.26
" METHODS"       -1.22
"Efq"            -1.21

Positive Logits

Token            Logit
"to"             0.81
" that"          0.68
"for"            0.66
"and"            0.65
(blank)          0.64
(blank)          0.57
"you"            0.56
"of"             0.55
"—"              0.53
" piernas"       0.53

Activations Density 0.038%

No Known Activations
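
As a rough sense of scale, the density figure can be converted into an approximate token count. The sketch below assumes, without confirmation from this page, that the density is the fraction of token positions where the feature has nonzero activation, and that it was measured over the dashboard prompt set listed under Configuration. The variable names are illustrative only.

```python
# Figures copied from this dashboard page.
num_prompts = 16_384
tokens_per_prompt = 128
activation_density_percent = 0.038

# Assumption: density = fraction of token positions where the feature fires,
# measured over the dashboard prompts (the page does not state this).
total_tokens = num_prompts * tokens_per_prompt
estimated_active_tokens = total_tokens * activation_density_percent / 100

print(f"total tokens: {total_tokens:,}")
print(f"estimated active tokens: {estimated_active_tokens:.0f}")
```

Under those assumptions the feature would fire on roughly 800 of about 2.1 million token positions, which is consistent with a sparse feature.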