© Neuronpedia 2026

Neuronpedia

Qwen3-1.7B · 26-LLAMASCOPE-2-LORSA-16K-K64 · INDEX 680

Explanations

say yellow



Negative Logits

Token           Logit
".pg"           -19.75
"Gos"           -17.38
" PostgreSQL"   -17.25
"rove"          -17.00
"rc"            -16.50
"Nar"           -16.13
" cigar"        -16.00
"搪"            -15.13
"SAS"           -15.00
"笮"            -15.00

Positive Logits

Token           Logit
"黄"            54.00
"黃"            52.25
" yellow"       45.25
"�"             43.50
"Yellow"        43.50
"黄色"          43.50
" Yellow"       43.00
" YELLOW"       39.25
"-yellow"       38.00
"yellow"        36.25

Activations Density 0.132%

No Known Activations