Neuronpedia

Explanations

action or state followed by context

np_acts-logits-general · gemini-2.5-flash-lite

Configuration: google/gemma-scope-27b-pt-res/layer_22/width_131k

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token           Logit
"one"           -1.08
" necessary"    -1.07
" something"    -1.05
"七十"          -1.05
"でしたね"      -1.05
"んだよね"      -1.00
" that"         -0.99
"Only"          -0.99
"طع"            -0.98
"éton"          -0.96

Positive Logits

Token           Logit
"the"           1.47
"を"            1.19
"ytä"           1.07
"essä"          1.05
"の"            1.03
"OTROS"         1.02
"ae"            1.01
"镰"            1.00
" wordt"        0.99
"bub"           0.96
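
The two lists above rank vocabulary tokens by how strongly the feature lowers or raises their output logits. As a minimal sketch of that ranking step only, assuming a hypothetical dictionary `logit_effects` filled with a few of the values shown on this page (how Neuronpedia computes the values themselves is not stated here), the most positive and most negative entries could be selected like this:

```python
import heapq

# Hypothetical input: token -> logit effect, copied by hand from this page.
# Leading spaces are kept because they are part of the token.
logit_effects = {
    "the": 1.47,
    "を": 1.19,
    "ytä": 1.07,
    "one": -1.08,
    " necessary": -1.07,
    " something": -1.05,
}

def top_logits(effects, k=3):
    """Return (k most positive, k most negative) (token, value) pairs."""
    positive = heapq.nlargest(k, effects.items(), key=lambda kv: kv[1])
    negative = heapq.nsmallest(k, effects.items(), key=lambda kv: kv[1])
    return positive, negative

positive, negative = top_logits(logit_effects)
```

Here `positive` starts with the token "the" and `negative` starts with "one", matching the order shown in the lists.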

Activations Density 0.073%
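
For a rough sense of scale, the density can be converted into an approximate token count. This is only a sketch: it assumes the density is the fraction of dashboard tokens on which the feature is active, and that it was measured over the dashboard prompts listed in the configuration. Neither assumption is confirmed on this page.

```python
# Values taken from the configuration and density lines on this page.
PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.073

# Total tokens in the dashboard prompt set.
total_tokens = PROMPTS * TOKENS_PER_PROMPT

# Assumption: density is the share of those tokens where the feature fires.
approx_active_tokens = round(total_tokens * DENSITY_PERCENT / 100)
```

Under those assumptions, the feature would be active on roughly 2,300 of about 3.1 million tokens.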

No Known Activations

© Neuronpedia 2026
