Neuronpedia

Explanations

Okay, you want to run

np_acts-logits-general · gemini-2.5-flash-lite

Configuration

google/gemma-scope-2-27b-it/resid_post/layer_40_width_262k_l0_medium

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

甑	0.38
श्यक	0.38
ὐ	0.37
 ज्ञ	0.37
 الانترنت	0.37
ഹ്ലാ	0.37
ുട	0.37
 인터넷	0.37
ജ്യ	0.37
na	0.36

Positive Logits

 articol	0.40
rosis	0.37
 possible	0.36
смо	0.35
due	0.35
زم	0.35
ሪ	0.35
LU	0.34
ନ୍	0.34
 कर्	0.33

Activations Density 0.000%

No Known Activations

© Neuronpedia 2025
