Neuronpedia

Explanations

th (np_max-act · gemini-2.0-flash)

Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token         Logit
"th"          -0.81
"er"          -0.72
" Eras"       -0.68
"ev"          -0.68
" تضيفلها"    -0.62
"Th"          -0.60
"No"          -0.60
"rd"          -0.60
"pt"          -0.59
"bus"         -0.58

Positive Logits

Token               Logit
" prêtres"          0.59
"aarrggbb"          0.57
"SequentialGroup"   0.56
"posób"             0.56
"charpe"            0.54
"nourriture"        0.54
"Erreferentziak"    0.51
"mặt"               0.48
" rapporti"         0.48
"Còn"               0.47

Activations Density 0.106%

No Known Activations

© Neuronpedia 2025
