Neuronpedia


Explanations

complex or good suggestions

np_acts-logits-general · gemini-2.5-flash-lite


Configuration

google/gemma-scope-2-27b-it/resid_post/layer_16_width_262k_l0_medium

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1


Negative Logits

Token    Value
"했지만"    0.76
" แต่"    0.76
"but"    0.75
" nhưng"    0.69
"していますが"    0.66
"いましたが"    0.63
"但是我"    0.62
"뿐"    0.62
" 있지만"    0.62
"ですが"    0.60

Positive Logits

Token    Value
"tiene"    0.59
" त्याची"    0.55
"nj"    0.55
" musí"    0.54
"pisah"    0.54
" हैज"    0.53
"ters"    0.52
" található"    0.52
"miştir"    0.52
" می‌شود"    0.52

Activations Density 1.089%
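
To put the density figure in context, here is a rough back-of-the-envelope sketch. It assumes that the activation density is the fraction of tokens on which the feature fires, and that it was measured over the dashboard prompts listed above (238,145 prompts of 512 tokens each). Neither assumption is confirmed by this page, and the function name is hypothetical.

```python
def expected_active_tokens(
    num_prompts: int, tokens_per_prompt: int, density_percent: float
) -> tuple[int, float]:
    """Return (total tokens, estimated number of tokens where the feature is active).

    Assumption: density_percent is the share of all tokens with a nonzero
    activation, measured over exactly this prompt set.
    """
    total_tokens = num_prompts * tokens_per_prompt
    active_tokens = total_tokens * density_percent / 100.0
    return total_tokens, active_tokens


# Figures taken from the dashboard: 238,145 prompts, 512 tokens each, 1.089% density.
total_tokens, active_tokens = expected_active_tokens(238_145, 512, 1.089)
print(f"{total_tokens:,} tokens in total, roughly {active_tokens:,.0f} with a nonzero activation")
```

Under those assumptions the feature would be active on somewhat over a million tokens, which would make it a fairly frequently firing feature.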

No Known Activations

© Neuronpedia 2026
