Neuronpedia

Explanations

set notation examples

np_acts-logits-general · gemini-2.5-flash-lite

Configuration: google/gemma-scope-2-4b-it/resid_post/layer_9_width_262k_l0_medium

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

Token         Logit
"ка"          2.58
"ating"       2.24
"isted"       2.21
" Resmi"      2.19
"้"           2.19
"ค์"          2.17
"га"          2.15
" freshly"    2.12
"lijk"        2.11
"ates"        2.07

Positive Logits

Token            Logit
"ت"              2.91
"inputStream"    2.82
"ባድ"             2.77
"此之外"          2.74
"le"             2.42
"ल"              2.41
"minded"         2.37
"വ"              2.34
"Owned"          2.32
"ur"             2.32

Activations Density 0.002%

No Known Activations

© Neuronpedia 2025