Neuronpedia

INDEX

Explanations

gnome obsessed

np_acts-logits-general · gemini-2.5-flash-lite

Configuration

google/gemma-scope-2-1b-pt/resid_post/layer_13_width_16k_l0_medium

Prompts (Dashboard): 392,802 prompts, 256 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token    Logit
راض    1.97
ament    1.91
rain    1.68
ریض    1.64
ंपरा    1.64
enche    1.63
istice    1.62
𝚘    1.60
𝚛    1.58
驕    1.57

Positive Logits

Token    Logit
י    3.01
(not shown)    2.80
ところに    2.51
ি    2.45
sley    2.41
ARKS    2.40
erful    2.38
Rounded    2.30
som    2.29
加坡    2.28

Activations Density 1.844%
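
To put the density figure in scale, the short sketch below multiplies out the dashboard corpus size given under Configuration (392,802 prompts of 256 tokens each) and applies the 1.844% density to it. It assumes, without confirmation from this page, that the density is the fraction of dashboard tokens on which the feature activates; the variable names are illustrative only.

```python
# Figures copied from this page.
PROMPTS = 392_802          # dashboard prompts
TOKENS_PER_PROMPT = 256    # tokens in each prompt
DENSITY = 0.01844          # "Activations Density 1.844%"

# Total tokens in the dashboard corpus (exact integer arithmetic).
total_tokens = PROMPTS * TOKENS_PER_PROMPT

# ASSUMPTION: density is measured per token over the dashboard corpus.
# Under that reading, this is the implied number of activating tokens.
estimated_active = round(total_tokens * DENSITY)

print(f"total tokens:       {total_tokens:,}")
print(f"estimated активных: {estimated_active:,}".replace("активных", "active"))
```

The corpus works out to 100,557,312 tokens, so under the stated assumption the feature would fire on roughly 1.85 million of them.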

No Known Activations

© Neuronpedia 2025
