Neuronpedia

Explanations

Bal

np_max-act · gemini-2.0-flash

Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token      Logit
"Sen"      -0.52
"sh"       -0.50
"bal"      -0.49
"sen"      -0.49
"Bal"      -0.48
"es"       -0.47
"Arma"     -0.46
"Bal"      -0.46
"sa"       -0.46
"fjspx"    -0.45

Positive Logits

Token          Logit
"ſelf"         1.00
" myſelf"      0.91
"ſelves"       0.87
"ing"          0.86
" uſed"        0.86
" itſelf"      0.84
" ſta"         0.82
" uſe"         0.81
" preſent"     0.78
" pleaſure"    0.78

Activations Density 0.093%

No Known Activations

© Neuronpedia 2025
