© Neuronpedia 2026

Neuronpedia

Qwen3-32B
32-RESID-BATCHTOPK-65K
Index 8334

Explanations

benefit and harm

np_acts-logits-general · gemini-2.5-flash-lite

Configuration

adamkarvonen/qwen3-32b-saes/saes_Qwen_Qwen3-32B_batch_top_k/resid_post_layer_32

Prompts (Dashboard)

16,384 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

"寤" (awake)              -0.09
" Appeal"                -0.09
"ords"                   -0.09
"_singular"              -0.08
"小鸟" (little bird)      -0.08
" stylist"               -0.08
"(*("                    -0.08
"忐" (nervous, as in 忐忑) -0.08
"配套" (supporting set)   -0.08
"官方" (official)         -0.08

Positive Logits

" harm"                  0.22
"益" (benefit)            0.21
"有益" (beneficial)       0.19
"危害" (harm)             0.19
"害" (harm)               0.18
" Harm"                  0.16
"的危害" (the harm of)    0.16
"坏事" (bad thing)        0.16
"受益" (to benefit)       0.16
" beneficial"            0.16

Activations Density 0.109%
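To give a sense of scale, a density percentage can be converted into an approximate token count. The sketch below assumes, without confirmation from the page, that the density was measured over the dashboard prompt set listed under Configuration (16,384 prompts of 128 tokens each); `expected_active_tokens` is a hypothetical helper name.

```python
PROMPTS = 16_384            # from "Prompts (Dashboard)"
TOKENS_PER_PROMPT = 128     # from "Prompts (Dashboard)"
DENSITY_PERCENT = 0.109     # from "Activations Density"


def expected_active_tokens(prompts: int, tokens_per_prompt: int, density_percent: float) -> tuple[int, int]:
    """Return (total tokens, approximate number of tokens where the feature fires).

    Assumption: density is the share of token positions with a nonzero activation.
    """
    total = prompts * tokens_per_prompt
    return total, round(total * density_percent / 100)


if __name__ == "__main__":
    total, active = expected_active_tokens(PROMPTS, TOKENS_PER_PROMPT, DENSITY_PERCENT)
    print(f"{active} of {total} tokens")
```

If the assumption holds, the feature would fire on roughly one token in every 900.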

No Known Activations