© Neuronpedia 2026

Neuronpedia

Model: Qwen3-1.7B
Source: 27-LLAMASCOPE-2-LORSA-16K-K64
Index: 15429

Explanations

say "consent"

Negative Logits

Token          Logit
" Kills"       -17.63
"CString"      -17.00
"烟囱"         -16.25
"_internal"    -16.13
"<tag"         -15.94
"umeric"       -15.56
"椒"           -15.44
"_dense"       -15.38
" internal"    -15.13
"keleton"      -15.06

Positive Logits

Token           Logit
" Consent"      24.25
"cons"          23.38
"Cons"          22.75
" consent"      22.75
"-cons"         22.38
" consenting"   22.00
" cons"         21.25
".Cons"         20.75
"自愿"          20.00
" Cons"         19.00

Activations Density 0.109%

No Known Activations