Neuronpedia

Explanations

Country, month, scenario, Bush, modern

np_acts-logits-general · gemini-2.5-flash-lite

Configuration

google/gemma-scope-2-27b-it/resid_post/layer_31_width_65k_l0_medium
Prompts (Dashboard): 238,145 prompts, 512 tokens each
Dataset (Dashboard): lmsys + oasst1

Negative Logits

Token           Value
"lja"           0.49
" Telling"      0.49
"Hell"          0.46
" films"        0.46
"ין"            0.46
"Α"             0.45
" flora"        0.45
"তেও"           0.44
"咫"            0.44
" decompose"    0.44

Positive Logits

Token           Value
"\}$"           0.51
" اُن"          0.47
"áng"           0.47
"*);"           0.46
"returnValue"   0.46
" prevención"   0.45
"empê"          0.45
"车载"          0.44
")_{"           0.43
" ஊழிய"         0.43

Activations Density 0.001%

No Known Activations

© Neuronpedia 2026
