Neuronpedia


INDEX

Explanations

enables or allows

np_acts-logits-general · gemini-2.5-flash-lite


Configuration

google/gemma-scope-2-4b-it/resid_post/layer_22_width_262k_l0_medium

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1


NEGATIVE LOGITS

Token (quoted; leading spaces are part of the token)  Value
" تھے۔"  1.02
"։"  0.99
" }}$."  0.98
" تھیں۔"  0.91
"തെന്നും"  0.85
" ہیں۔"  0.85
"être"  0.84
"“."  0.84
"}$."  0.84
".]."  0.82

POSITIVE LOGITS

Token (quoted; leading spaces are part of the token)  Value
" helps"  3.04
" creates"  2.64
" enables"  2.63
" allows"  2.60
" reduces"  2.58
" gives"  2.56
" brings"  2.52
" eliminates"  2.46
" promotes"  2.42
" enhances"  2.42

Activations Density 0.506%
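
As a rough sense of scale, the density figure can be combined with the dashboard prompt counts listed under Configuration. The sketch below assumes that the density is the fraction of dashboard tokens on which the feature is active, and that every prompt contributes exactly 512 tokens. The page states neither of these, and the function name is hypothetical.

```python
# Figures copied from this page; the interpretation is an assumption.
NUM_PROMPTS = 238_145        # "238,145 prompts"
TOKENS_PER_PROMPT = 512      # "512 tokens each"
DENSITY_PERCENT = 0.506      # "Activations Density 0.506%"


def estimated_active_tokens(num_prompts: int,
                            tokens_per_prompt: int,
                            density_percent: float) -> int:
    """Estimate how many tokens the feature fires on.

    Assumes density = active tokens / total dashboard tokens.
    """
    total_tokens = num_prompts * tokens_per_prompt
    return round(total_tokens * density_percent / 100)


total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT
active_tokens = estimated_active_tokens(
    NUM_PROMPTS, TOKENS_PER_PROMPT, DENSITY_PERCENT
)

print(f"total dashboard tokens: {total_tokens:,}")
print(f"estimated active tokens: {active_tokens:,}")
```

Under those assumptions the feature would be active on roughly 0.6 million of about 122 million tokens. The density is only given to three significant figures, so the result is an order-of-magnitude estimate.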

No Known Activations

© Neuronpedia 2026
