Neuronpedia



Explanations

code punctuation and string parts

np_acts-logits-general · gemini-2.5-flash-lite


Configuration

google/gemma-scope-27b-pt-res/layer_22/width_131k

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted


Negative Logits

Token                 Logit
"or"                  -2.52
" those"              -2.50
"on"                  -2.39
"an"                  -2.34
" When"               -2.33
(unrendered token)    -2.33
"to"                  -2.31
" What"               -2.31
"if"                  -2.30
" even"               -2.30

Positive Logits

Token                 Logit
"沵"                  3.17
" boister"            2.55
"摀"                  2.42
(unrendered token)    2.33
"⑊"                   2.33
(unrendered token)    2.33
" délais"             2.28
(unrendered token)    2.27
" WHICH"              2.25
" appeler"            2.25

Activations Density 0.003%

No Known Activations

© Neuronpedia 2025
