Neuronpedia

Explanations

arao

np_max-act · gemini-2.0-flash

Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token          Logit
iffance        -1.68
æus            -1.61
encils         -1.52
ipheral        -1.51
æa             -1.48
ratulations    -1.47
hematical      -1.43
iſt            -1.42
ugeot          -1.39
othesis        -1.38

Positive Logits

Token                Logit
if                   0.89
(token not shown)    0.86
un                   0.84
il                   0.83
one                  0.78
any                  0.78
of                   0.77
for                  0.76
my                   0.76
uk                   0.75

Activations Density 0.690%
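
As a rough sanity check on the figures above, the sketch below estimates how many dashboard tokens this feature would fire on. It assumes, which the page does not state, that activation density is the fraction of dashboard tokens with non-zero activation. The function name `estimate_active_tokens` is hypothetical and is not part of any Neuronpedia API.

```python
def estimate_active_tokens(num_prompts: int, tokens_per_prompt: int,
                           density_percent: float) -> int:
    """Estimate the number of tokens with non-zero activation.

    Assumption: density is a per-token fraction over the dashboard prompts.
    """
    total_tokens = num_prompts * tokens_per_prompt
    return round(total_tokens * density_percent / 100)


# Values taken from the dashboard configuration and density readout.
total = 16_384 * 128
active = estimate_active_tokens(16_384, 128, 0.690)
print(f"{total:,} tokens in total, roughly {active:,} active")
```

Under that assumption, the feature would be active on about 14,470 of the 2,097,152 dashboard tokens.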

No Known Activations

© Neuronpedia 2025
