© Neuronpedia 2026


Qwen3-1.7B / 26-LLAMASCOPE-2-LORSA-16K-K64 / Index 755

Explanations

say "South Africa"


Negative Logits

Token                        Logit
"常德" (Changde)             -25.50
" Fuji"                      -24.88
" Oregon"                    -24.38
"Oregon"                     -24.13
"Portland"                   -24.00
" Mosul"                     -23.63
"岳阳" (Yueyang)             -23.63
" Istanbul"                  -23.00
"洱" (Er, as in Erhai)       -22.50
"昆明" (Kunming)             -22.00

Positive Logits

Token                        Logit
"南非" (South Africa)        52.00
" South"                     50.00
"South"                      46.50
" Johannesburg"              43.00
" apartheid"                 39.50
"SA"                         39.00
" SOUTH"                     38.00
" Cape"                      33.75
" Mandela"                   33.25
"南" (south)                 33.00
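
The logit lists rank vocabulary tokens by how strongly the feature raises or lowers their output logits. A minimal sketch of how such a ranking can be computed, assuming the feature has an output direction that is projected through the model's unembedding matrix. This is a common convention for feature dashboards, not a confirmed description of how this page was generated. The function name `top_logit_tokens`, its parameters, and the toy vocabulary and matrix are all hypothetical.

```python
import numpy as np


def top_logit_tokens(direction, unembed, vocab, k=10):
    """Return (positive, negative) lists of (token, logit) pairs.

    direction: (d_model,) feature output direction (assumed input)
    unembed:   (d_model, vocab_size) unembedding matrix (assumed input)
    vocab:     list of vocab_size token strings
    """
    logits = direction @ unembed   # one logit contribution per token
    order = np.argsort(logits)     # token indices, ascending by logit
    negative = [(vocab[i], float(logits[i])) for i in order[:k]]
    positive = [(vocab[i], float(logits[i])) for i in order[::-1][:k]]
    return positive, negative


# Toy data only: a 2-dimensional model with a 4-token vocabulary.
vocab = [" South", " Oregon", " Cape", " Fuji"]
unembed = np.array([
    [2.0, -1.0, 1.0, -2.0],
    [0.0, 1.0, 0.5, 0.0],
])
direction = np.array([1.0, 0.0])

positive, negative = top_logit_tokens(direction, unembed, vocab, k=2)
print(positive)  # most promoted tokens first
print(negative)  # most suppressed tokens first
```

Tokens are kept as raw strings, so " South" and "South" are distinct entries, which matches how they appear as separate rows in the lists.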

Activations Density: 0.289%
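
A density of 0.289% corresponds to roughly 29 active tokens per 10,000. A minimal sketch of that calculation, assuming density means the fraction of sampled tokens on which the feature's activation is above zero. That definition, the function name `activation_density`, and the synthetic activations are assumptions made for illustration.

```python
import numpy as np


def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`."""
    acts = np.asarray(activations, dtype=float)
    return float(np.mean(acts > threshold))


# Synthetic activations: 289 active tokens out of 100,000.
acts = np.zeros(100_000)
acts[:289] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")
```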

No Known Activations