Neuronpedia



Explanations

exclamatory punctuation

np_acts-logits-general · gemini-2.5-flash-lite


Configuration

google/gemma-scope-2-1b-pt/resid_post/layer_13_width_16k_l0_medium

Prompts (Dashboard)

392,802 prompts, 256 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted


Negative Logits

Token    Logit
[token not shown]    4.39
ness    4.39
了    4.35
nya    4.27
います    3.98
ন    3.95
র    3.91
으로    3.91
ों    3.82
side    3.80

Positive Logits

Token    Logit
eeee    4.79
eee    4.25
urope    4.17
iros    4.03
[token not shown]    3.97
न्द्रीय    3.93
이션    3.86
יים    3.83
क्स्ट    3.83
anwhile    3.72
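
The logit lists above are commonly produced by projecting a feature's decoder direction through the model's unembedding matrix and ranking the vocabulary by the resulting score. The following is a minimal sketch of that ranking step, assuming this is how the lists were derived. The function name `top_logit_tokens`, the toy vocabulary, and all numbers are hypothetical and are not taken from Neuronpedia or the model.

```python
def top_logit_tokens(direction, unembed, vocab, k=3):
    """Rank vocabulary tokens by dot(direction, unembed_row).

    direction: list of floats (hypothetical feature decoder direction)
    unembed:   one list of floats per vocabulary token (toy unembedding rows)
    Returns (top_positive, top_negative), each a list of (token, score).
    """
    scores = [
        sum(d * w for d, w in zip(direction, row))
        for row in unembed
    ]
    # Sort from the most promoted token to the most suppressed one.
    ranked = sorted(zip(vocab, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k], ranked[::-1][:k]


# Toy data for illustration only; these are not the real model weights.
vocab = ["eee", "ness", "side", "urope"]
unembed = [[1.0, 0.0], [-1.0, 0.5], [-0.5, -1.0], [0.5, 0.5]]
direction = [2.0, 1.0]

top_pos, top_neg = top_logit_tokens(direction, unembed, vocab, k=2)
print(top_pos)
print(top_neg)
```

In a real setting the direction and unembedding rows would have the model's hidden dimension, and a tensor library would replace the explicit loops.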

Activations Density 4.396%
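
An activation density is usually read as the fraction of token positions on which the feature fires. The sketch below illustrates that reading, assuming the density is taken over the dashboard prompts listed under Configuration; that assumption, the helper name `activation_density`, and the zero threshold are not confirmed by the page.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds the threshold."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy activations for illustration only: one of four positions is active.
toy_density = activation_density([0.0, 0.0, 1.2, 0.0])

# Token count implied by the dashboard configuration
# (392,802 prompts of 256 tokens each).
total_tokens = 392_802 * 256

# Rough number of active positions if 4.396% applies to that token count
# (an assumption, see the note above).
approx_active_tokens = round(total_tokens * 0.04396)
print(toy_density, total_tokens, approx_active_tokens)
```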

No Known Activations

© Neuronpedia 2025
