© Neuronpedia 2026

Neuronpedia

Model: Qwen3-1.7B
Source: 27-LLAMASCOPE-2-LORSA-16K-K64
Index: 15512

Explanations

say "self"


Negative Logits

Token            Logit     Gloss
"濉"             -19.75    (Sui, a river name)
"Commercial"     -19.50
"fcc"            -19.25
"lyr"            -19.00
" subsidy"       -18.88
" bureaucr"      -18.75
" Commercial"    -18.50
"FileStream"     -18.13
" policy"        -17.88
" commercial"    -17.75

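Positive Logits

Token            Logit     Gloss
"意图"           18.88     (intent)
"对自己的"       18.00     (toward one's own)
"钻石"           17.75     (diamond)
"滑"             17.63     (slide, slippery)
"这个人"         15.56     (this person)
"猩"             15.38     (ape)
"认知"           15.19     (cognition)
"让自己"         15.06     (let oneself)
"↵"              15.06     (newline)
"自我"           15.00     (self)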

Activation Density: 0.602%

No Known Activations