Neuronpedia

INDEX

Explanations

np_acts-logits-general · gemini-2.5-flash-lite

Configuration

google/gemma-scope-2-27b-it/resid_post/layer_31_width_262k_l0_medium
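The configuration string above appears to pack several fields into one path: an organization, a release, a hook point, and a variant naming the layer, dictionary width, and L0 setting. The sketch below splits it into those parts. The field names (`org`, `release`, `hook`, `layer`, `width`, `l0`) and the helper `parse_sae_id` are hypothetical labels chosen for illustration, not Neuronpedia's own schema, and the pattern is only assumed to hold for identifiers shaped like this one.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SaeId:
    # Field names are illustrative assumptions, not an official schema.
    org: str
    release: str
    hook: str
    layer: int
    width: str
    l0: str


# Matches variants shaped like "layer_31_width_262k_l0_medium".
_VARIANT = re.compile(r"layer_(\d+)_width_([0-9a-z]+)_l0_([0-9a-z]+)")


def parse_sae_id(sae_id: str) -> SaeId:
    """Split a four-part SAE identifier into its components."""
    parts = sae_id.split("/")
    if len(parts) != 4:
        raise ValueError(f"expected 4 '/'-separated parts, got {len(parts)}")
    org, release, hook, variant = parts
    match = _VARIANT.fullmatch(variant)
    if match is None:
        raise ValueError(f"unrecognized variant: {variant!r}")
    layer, width, l0 = match.groups()
    return SaeId(org, release, hook, int(layer), width, l0)


if __name__ == "__main__":
    print(parse_sae_id(
        "google/gemma-scope-2-27b-it/resid_post/layer_31_width_262k_l0_medium"
    ))
```

Keeping `width` as the string `262k` rather than converting it to an integer avoids guessing whether `k` means 1,000 or 1,024.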

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

"ot"           0.52
"lists"        0.51
"를"           0.47
"lista"        0.46
"enriched"     0.45
" employment"  0.45
"lisher"       0.44
"list"         0.43
" फ़ोन"         0.43
"dated"        0.43

Positive Logits

"ازت"          0.45
" schneller"   0.44
" entweder"    0.44
" стре"        0.44
" stessi"      0.43
"ణ"            0.43
"逭"           0.42
" стадии"      0.42
"Bxg"          0.42
"μπο"          0.42

Activations Density 0.001%

No Known Activations
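To put the density figure in context, the arithmetic below turns the dashboard size (238,145 prompts of 512 tokens each) into a total token count and a rough number of activating tokens. It rests on two assumptions not stated on the page: that the density is the fraction of dashboard tokens on which the feature fires, and that every prompt contributes exactly 512 tokens. The displayed 0.001% is also rounded, so this is an order-of-magnitude estimate at best, and the page itself lists no known activations.

```python
PROMPTS = 238_145          # from the dashboard
TOKENS_PER_PROMPT = 512    # from the dashboard
DENSITY_PERCENT = 0.001    # displayed (rounded) activation density


def total_tokens(prompts: int, tokens_per_prompt: int) -> int:
    """Total tokens, assuming every prompt has the full token count."""
    return prompts * tokens_per_prompt


def expected_active_tokens(n_tokens: int, density_percent: float) -> float:
    """Tokens on which the feature would fire at the given density.

    Assumes the density is measured over these same tokens.
    """
    return n_tokens * density_percent / 100


if __name__ == "__main__":
    n = total_tokens(PROMPTS, TOKENS_PER_PROMPT)
    print(f"total tokens: {n:,}")  # total tokens: 121,930,240
    print(f"expected active tokens: {expected_active_tokens(n, DENSITY_PERCENT):.0f}")
```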

© Neuronpedia 2026