© Neuronpedia 2026

Neuronpedia

Qwen3-1.7B · 27-LLAMASCOPE-2-LORSA-16K-K64 · Index 16322

Explanations

say "rehab"

Negative Logits

"pne"        -19.25
" yağ"       -18.75
"pine"       -18.25
"洋葱"       -17.25
"queen"      -17.13
"茄子"       -17.13
"ilan"       -17.00
"コピー"     -17.00
"Pir"        -17.00
".inflate"   -16.50

Positive Logits

" rehab"       22.25
" Recovery"    19.88
" recovery"    19.88
"康复"         18.75
" addict"      18.38
"复发"         18.00
"戒"           17.88
" Recover"     17.75
" addiction"   17.63
" Addiction"   17.63

Activations Density: 0.163%

No Known Activations