Neuronpedia


INDEX

Explanations

this

np_max-act · gemini-2.0-flash


Configuration

Prompts (Dashboard)

16,384 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted


Negative Logits

Token                  Logit
"are"                  -0.55
(token not captured)   -0.51
(token not captured)   -0.50
"It"                   -0.50
"‘"                    -0.48
"This"                 -0.47
"’"                    -0.47
"There"                -0.46
(token not captured)   -0.46
"In"                   -0.45

Positive Logits

Token                  Logit
" snippetHide"         1.25
"expandindo"           1.20
" Савезне"             1.05
" للمعارف"             1.05
"ftagPool"             1.02
"تقاوى"                1.00
" defaultstate"        1.00
"IsMutable"            0.98
" Administrativna"     0.96
" myſelf"              0.96

Activations Density 0.209%
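
To put the density figure in context, the short sketch below works out how many tokens the dashboard configuration covers and roughly how many of them a density of 0.209% would correspond to. It assumes that the density is the fraction of dashboard tokens on which the feature fires, which this page does not state; the function names are illustrative and not part of any Neuronpedia API.

```python
# Values copied from the dashboard configuration on this page.
NUM_PROMPTS = 16_384
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.209


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Number of token positions covered by the dashboard prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(density_percent: float, n_tokens: int) -> int:
    """Approximate count of token positions with a nonzero activation.

    Assumption: density is measured over the same tokens as the dashboard.
    """
    return round(density_percent / 100 * n_tokens)


if __name__ == "__main__":
    n = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
    print(f"dashboard tokens: {n:,}")
    print(f"estimated active tokens: {estimated_active_tokens(DENSITY_PERCENT, n):,}")
```

Under that assumption, the feature would fire on only a few thousand of roughly two million token positions.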

No Known Activations

© Neuronpedia 2025
