Neuronpedia


Explanations

specific nouns

np_acts-logits-general · gemini-2.5-flash-lite


Configuration: google/gemma-scope-27b-pt-res/layer_22/width_131k

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted


Negative Logits

Token (quoted to show leading spaces), followed by logit value:

"发育"  -1.12
"tamientos"  -1.02
"麻辣"  -1.02
"Kig"  -1.02
"┤"  -1.00
"polish"  -0.99
"膵"  -0.98
"edoria"  -0.97
"⫻"  -0.97
" tiek"  -0.96

Positive Logits

Token (quoted to show leading spaces), followed by logit value:

" these"  1.27
"Também"  1.24
"或者"  1.23
"这些"  1.22
" потому"  1.19
"而"  1.19
" этих"  1.18
"いただきたい"  1.15
"But"  1.13
"Appellee"  1.12

Activations Density 0.055%
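
To put the density figure in context, here is a rough back-of-the-envelope sketch in Python. It assumes that the density is the fraction of tokens on which the feature has a non-zero activation, and that it was measured over the dashboard prompt set listed in the configuration; neither assumption is confirmed by this page. The variable names are illustrative only.

```python
# Figures copied from the dashboard on this page.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.055  # displayed value, already rounded

# Assumption: density = (tokens with non-zero activation) / (total tokens),
# computed over the dashboard prompt set.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT
approx_active_tokens = total_tokens * DENSITY_PERCENT / 100

print(f"total tokens: {total_tokens:,}")
print(f"approx. active tokens: {approx_active_tokens:,.0f}")
```

Under those assumptions the feature would fire on roughly 1,730 of about 3.1 million tokens. Because the displayed density is rounded, the count is only approximate.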

No Known Activations

© Neuronpedia 2026
