Neuronpedia

INDEX

Explanations

recomp

np_max-act · gemini-2.0-flash

Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token            Logit
" slaughter"     -0.97
" Majefty"       -0.96
"Efq"            -0.96
" pleaſure"      -0.85
"abestanden"     -0.85
" protoimpl"     -0.83
" fubject"       -0.82
" Chriftian"     -0.81
"WebVitals"      -0.80
" uſe"           -0.80

Positive Logits

Token            Logit
"er"             0.78
"em"             0.68
"time"           0.68
(not rendered)   0.67
(not rendered)   0.61
"age"            0.61
"house"          0.61
(not rendered)   0.61
"ee"             0.60
"ed"             0.59

Activations Density 0.108%

No Known Activations

© Neuronpedia 2025