Neuronpedia

Explanations

privacy and time

np_acts-logits-general · gemini-2.5-flash-lite

Configuration

google/gemma-scope-2-27b-it/resid_post/layer_31_width_262k_l0_medium

Prompts (Dashboard): 238,145 prompts, 512 tokens each

Dataset (Dashboard): lmsys + oasst1

Negative Logits

"ადგენ"          0.46
"RELATIVA"       0.46
"ريبي"           0.45
" moeil"         0.44
" instancia"     0.43
"скохозяй"       0.42
" TRANSPORTURI"  0.42
"Ꮡ"              0.42
"říve"           0.42
" esposo"        0.42

Positive Logits

"for"            0.67
" glazing"       0.51
"for"            0.50
"set"            0.50
" call"          0.50
"but"            0.49
"ید"             0.49
" naming"        0.49
" plaid"         0.49
" supernatural"  0.48

Activations Density: 0.000%

No Known Activations

© Neuronpedia 2025
