Neuronpedia

Explanations

speech

np_max-act · gemini-2.0-flash

Configuration

Prompts (Dashboard)

16,384 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits (tokens quoted to show leading spaces)

Token               Logit
"utilisons"         -0.69
" itſelf"           -0.60
" ―――――"            -0.59
"issaient"          -0.58
"ftagPool"          -0.58
" Monfieur"         -0.57
" gehör"            -0.57
" cdti"             -0.56
"ValueGeneration"   -0.56
"soever"            -0.55

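Positive Logits (tokens quoted to show leading spaces; "(blank)" marks a token with no visible text on the page)

Token      Logit
"..."      0.63
(blank)    0.60
"...."     0.54
"on"       0.54
(blank)    0.53
"Feb"      0.53
"…"        0.52
" this"    0.51
"..."      0.50
".."       0.49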

Activations Density 0.055%
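
To give a rough sense of scale, the sketch below multiplies out the dashboard configuration (16,384 prompts of 128 tokens each) and applies the 0.055% density figure to it. It assumes that the density is the fraction of dashboard tokens on which the feature has a nonzero activation; the page does not state this definition, so treat the second number as an estimate only. The variable names are illustrative.

```python
# Figures copied from the dashboard configuration and density readout.
NUM_PROMPTS = 16_384
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY_PERCENT = 0.055

# Total number of tokens across all dashboard prompts.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Assumption (not stated on the page): density is the share of those
# tokens on which the feature fires at all.
expected_active_tokens = total_tokens * ACTIVATION_DENSITY_PERCENT / 100

print(f"total tokens: {total_tokens:,}")                            # total tokens: 2,097,152
print(f"approx. activating tokens: {expected_active_tokens:,.0f}")  # approx. activating tokens: 1,153
```

Under that assumption, the feature would fire on roughly one token in every 1,800.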

No Known Activations

© Neuronpedia 2026
