Neuronpedia

Explanations

the same

np_acts-logits-general · gemini-2.5-flash-lite

Configuration

google/gemma-scope-27b-pt-res/layer_34/width_131k

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token (quoted to show leading spaces)    Logit

"つまり"      -1.13
"conian"      -1.03
"あまり"      -0.96
"ﻳ"           -0.95
"rola"        -0.94
"the"         -0.92
"でしたが"    -0.92
"しかも"      -0.92
" perhaps"    -0.91
"jenie"       -0.91

Positive Logits

Token (quoted to show leading spaces)    Logit

" same"        3.75
" же"          1.84
" gleichen"    1.60
" stessa"      1.48
" stesso"      1.33
" mesma"       1.33
" gleiche"     1.31
" ίδ"          1.28
" stesse"      1.25
" mismos"      1.23

Activations Density 0.052%
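
To give the density figure a sense of scale, the sketch below converts it into an approximate token count. It rests on two assumptions that the page does not state: that "Activations Density" means the fraction of tokens on which the feature has a nonzero activation, and that it was measured over the same 24,576 prompts of 128 tokens listed under Prompts (Dashboard). The function name is illustrative only.

```python
# Figures copied from the dashboard; their interpretation is an assumption.
NUM_PROMPTS = 24_576              # Prompts (Dashboard)
TOKENS_PER_PROMPT = 128           # tokens per prompt
ACTIVATION_DENSITY = 0.052 / 100  # 0.052% expressed as a fraction


def expected_active_tokens(num_prompts, tokens_per_prompt, density):
    """Return (total tokens, estimated tokens with nonzero activation).

    Assumes density is the fraction of tokens on which the feature fires.
    """
    total_tokens = num_prompts * tokens_per_prompt
    return total_tokens, total_tokens * density


total_tokens, active_tokens = expected_active_tokens(
    NUM_PROMPTS, TOKENS_PER_PROMPT, ACTIVATION_DENSITY
)
print(f"{total_tokens:,} tokens in the dashboard prompt set")
print(f"roughly {active_tokens:,.0f} tokens with a nonzero activation")
```

Under those assumptions the feature would fire on roughly 1,600 of about 3.1 million tokens, which fits a sparse, narrowly targeted feature.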

No Known Activations

© Neuronpedia 2025