Neuronpedia



Explanations

`#include` directives

np_acts-logits-general · gemini-2.5-flash-lite


Configuration

google/gemma-scope-27b-pt-res/layer_34/width_131k

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted


Negative Logits

" usize"  -0.83
"฿"  -0.79
"assium"  -0.79
"Vec"  -0.74
"cT"  -0.73
"されます"  -0.72
"と言っても"  -0.71
"⌀"  -0.71
"ishable"  -0.70
"ㄣ"  -0.70

Positive Logits

"[]."  0.84
" ทอง"  0.80
"滚"  0.77
"滾"  0.75
" Mehl"  0.71
" matic"  0.70
" rowIndex"  0.70
" bronchial"  0.70
"Joke"  0.70
"しば"  0.70

Activations Density 0.043%

No Known Activations

© Neuronpedia 2025
