Neuronpedia

Explanations

percent

np_max-act · gemini-2.0-flash

Configuration

Prompts (Dashboard)

16,384 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token       Logit
."));       -1.39
.";         -1.27
".          -1.23
%");        -1.21
.");        -1.16
"]);        -1.16
 }}$}       -1.15
")));       -1.14
.",         -1.13
"])         -1.13

Positive Logits

Token       Logit
↵↵          0.81
The         0.71
            0.69
 Sugar      0.61
DockStyle   0.61
↵           0.59
At          0.58
In          0.57
 However    0.56
For         0.56

Activations Density 0.140%

No Known Activations

© Neuronpedia 2025
