© Neuronpedia 2026

Neuronpedia

Qwen3-1.7B
26-LLAMASCOPE-2-LORSA-16K-K64
Index: 638

Explanations

say "India"

Negative Logits

"漳"  -21.88
"SAC"  -19.13
"SZ"  -19.00
"连云港"  -19.00
"TAS"  -18.75
"濠"  -18.50
" Gors"  -18.00
"SZ"  -18.00
"徐州"  -18.00
"SAL"  -17.88

Positive Logits

" Indian"  61.00
"Indian"  59.25
"印度"  58.25
" India"  57.00
"India"  54.25
" Indians"  50.50
" indian"  48.75
"インド"  41.50
" india"  41.00
"杭州"  40.75

Activations Density 0.281%
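
To put the density figure in perspective, the short sketch below converts the percentage into an approximate firing rate. It assumes that "Activations Density" means the fraction of sampled tokens on which the feature has a nonzero activation. The page does not define the term, and the variable names are illustrative only.

```python
# Assumption: "Activations Density" is the share of sampled tokens
# on which the feature is active (nonzero activation).
density_percent = 0.281  # value reported on this page

# Convert the percentage into a plain fraction.
density_fraction = density_percent / 100.0

# Average number of tokens between activations under that assumption.
tokens_per_activation = 1.0 / density_fraction

print(f"fraction of tokens active: {density_fraction:.5f}")
print(f"about one activation every {tokens_per_activation:.0f} tokens")
```

Under that reading, the feature fires on roughly one token in every few hundred, which is what one would expect from a feature tied to a single narrow concept.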

No Known Activations