Neuronpedia

INDEX

Explanations

in a statement

np_acts-logits-general · gemini-2.5-flash-lite

Configuration

google/gemma-scope-27b-pt-res/layer_22/width_131k
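The source identifier above appears to pack several fields into one slash-separated path. A minimal sketch of splitting it apart, assuming the four segments mean organization, SAE release, layer, and dictionary width (the page does not label them, and the function and key names here are hypothetical):

```python
def parse_sae_path(path: str) -> dict:
    """Split a slash-separated SAE identifier into labeled parts.

    Assumption: the path has exactly four segments in the order
    organization / release / layer_<n> / width_<label>.
    """
    organization, release, layer, width = path.split("/")
    return {
        "organization": organization,
        "release": release,
        # "layer_22" -> 22
        "layer": int(layer.split("_", 1)[1]),
        # "width_131k" -> "131k" (kept as a label, not converted to a count)
        "width_label": width.split("_", 1)[1],
    }


parsed = parse_sae_path("google/gemma-scope-27b-pt-res/layer_22/width_131k")
print(parsed)
```

The width is kept as the string "131k" because the page does not state the exact number of features it stands for.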

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token         Logit
"by"          -1.59
" other"      -1.54
" these"      -1.42
"it"          -1.33
" each"       -1.32
" 그리고"      -1.26
" both"       -1.23
"以及"         -1.20
" setiap"     -1.15
"好評"         -1.13

Positive Logits

Token         Logit
"thenburg"    1.34
"ᾐ"           1.32
" estavam"    1.19
" spiega"     1.16
" médec"      1.16
" sidste"     1.13
" parteci"    1.11
" começaram"  1.10
" gestes"     1.10
"缫"           1.09

Activations Density 0.014%
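To put the density figure in context, here is a rough back-of-the-envelope sketch. It assumes the density is the share of dashboard tokens on which the feature fires and that it was measured over the same 24,576 prompts of 128 tokens listed under Configuration. The page confirms neither assumption, so the result is only illustrative.

```python
# Figures taken from the page.
PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.014

# Total tokens in the dashboard prompt set.
total_tokens = PROMPTS * TOKENS_PER_PROMPT

# Assumption: density = (tokens with nonzero activation) / (total tokens).
estimated_active_tokens = total_tokens * DENSITY_PERCENT / 100

print(f"total tokens: {total_tokens:,}")
print(f"estimated activating tokens: {estimated_active_tokens:.0f}")
```

Under these assumptions the feature would fire on a few hundred tokens out of roughly three million, which fits a very sparse feature.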

No Known Activations

© Neuronpedia 2025