Neuronpedia

Explanations

that

np_max-act · gemini-2.0-flash

Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

<bos>  -0.80
 الحره  -0.61
 initComponents  -0.57
."]  -0.57
.*")]  -0.56
.’”  -0.56
).]  -0.55
.'"  -0.54
])));  -0.54
>*/  -0.51

Positive Logits

[token missing]  0.58
abella  0.57
 Brahma  0.55
 each  0.55
 Shiva  0.55
 base  0.53
mặt  0.52
mvn  0.52
 Bohr  0.52
lossians  0.52

Activations Density 0.003%

No Known Activations

© Neuronpedia 2025
