Neuronpedia


INDEX

Explanations

Russian negation particle "Не"

np_acts-logits-general · gemini-2.5-flash-lite


Configuration

google/gemma-scope-27b-pt-res/layer_10/width_131k

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted


Negative Logits

"𖤍"  -2.56
" anderen"  -2.44
".”"  -2.42
"Ꝑ"  -2.42
" arbeta"  -2.38
"ᨆ"  -2.34
"玘"  -2.34
"琊"  -2.31
"豋"  -2.30
"of"  -2.28

Positive Logits

"ization"  2.39
" They"  2.38
" Не"  2.28
"↵"  2.28
" mereka"  2.27
"Ꭳ"  2.25
"身体"  2.22
"It"  2.19
" Ві"  2.17
"There"  2.16

Activations Density 0.002%

No Known Activations

© Neuronpedia 2026
