© Neuronpedia 2026

Model: Qwen3-1.7B
Source: 27-LLAMASCOPE-2-LORSA-16K-K64
Index: 15625

Explanations

say "password"


Negative Logits

Token             Logit
"沱"              -17.50
"emit"            -16.88
" emit"           -16.63
" Burl"           -16.50
" Emit"           -16.38
"ARB"             -16.13
"�"               -16.00
"öl"              -15.94
" PROCUREMENT"    -15.88
"\temit"          -15.69

Positive Logits

Token             Logit
" password"       43.75
" passwords"      42.00
" Password"       40.50
"password"        40.00
"密码"            39.50
"Password"        38.75
".Password"       38.00
"\tpassword"      37.75
".password"       36.25
"=password"       36.25

Activations Density 0.126%

No Known Activations