    The Residual Stream

    Neuronpedia Blog

    HeadVis, plus NLA Contributions and more SAEs

    Explore 36,000+ Attention Heads Across 37 Models
    By Johnny Lin, David Chanin · June 9th, 2026

    This post is human-written and not AI generated or edited.

    Hey all! In collaboration with Anthropic, we're introducing a new core Neuronpedia feature for exploring attention heads. We also have a few updates on NLAs and SAEs, plus a quick poll at the end.


    ➡️ Try HeadVis - Qwen 3.5 0.8B

    HeadVis (Luger, Kamath, et al.) is Anthropic's tool for exploring attention heads. It's now available for 37 models on Neuronpedia, for a total of 36,000+ attention head dashboards.

    Let's examine an induction head (an attention head responsible for repeating previously seen token patterns like [A][B]...[A]->[B]) in Qwen3.5-0.8B:

    Induction head in Qwen 3.5 0.8B

    In the sequence above, the species name "Glenea anticepunctata" is repeated multiple times. As we hover over the tokens in its name, the orange lines show the attention head looking back at previous copies of the species name (specifically, the token immediately after the previous instance of the current token) so it knows what to output next. The grey lines show the opposite: where the hovered token is being referenced by a future token.

    The rest of the attention dashboard shows other metrics and visualizations, such as self-attention score, max attention distribution, and top query and key tokens.

    Head Finder

    This Qwen 3.5 model has 24 layers and 8 heads per layer, for a total of 192 attention heads, so how did we find an induction head? The Head Finder (enabled by clicking "Finder" on an attention head dashboard) lets us find the top N heads by notable, precomputed metrics. In this case, we ranked all heads by induction score, which is calculated by averaging the induction-pattern attention values across many sequences.
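To make the induction score idea concrete, here is a minimal sketch in plain Python. It is not the HeadVis implementation: the function names, the choice to use only the most recent earlier copy of a token, and the toy head labels are all assumptions for illustration. For each repeated token, it reads the attention weight on the position right after that token's previous occurrence, averages those weights, and then ranks heads by the result.

```python
from statistics import mean


def induction_score(tokens, attn):
    """Average attention one head pays to induction targets in one sequence.

    tokens: list of token ids or strings.
    attn:   square matrix (list of rows); attn[q][k] is the attention weight
            from query position q to key position k.
    Assumption: only the most recent earlier copy of each token is used.
    """
    last_seen = {}  # token -> most recent position where it appeared
    values = []
    for q, tok in enumerate(tokens):
        if tok in last_seen:
            target = last_seen[tok] + 1  # token right after the previous copy
            if target < q:  # skip immediate repeats, where target is q itself
                values.append(attn[q][target])
        last_seen[tok] = q
    return mean(values) if values else 0.0


def head_induction_score(examples):
    """Average the per-sequence score over (tokens, attn) pairs for one head."""
    return mean([induction_score(tokens, attn) for tokens, attn in examples])


def top_heads(scores, n):
    """Return the n head labels with the highest scores."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Toy data: the sequence [A][B][C][A][B][C] and two hypothetical heads.
tokens = ["A", "B", "C", "A", "B", "C"]
n = len(tokens)

# A head that spreads attention evenly over all visible (causal) positions.
uniform = [[1.0 / (q + 1) if k <= q else 0.0 for k in range(n)] for q in range(n)]

# A head that puts all attention on the induction target when one exists.
copying = [[0.0] * n for _ in range(n)]
for q, k in [(3, 1), (4, 2), (5, 3)]:
    copying[q][k] = 1.0

# Keys are made-up (layer, head) labels, not real heads in any model.
scores = {
    (0, 0): head_induction_score([(tokens, uniform)]),
    (1, 3): head_induction_score([(tokens, copying)]),
}
best = top_heads(scores, 1)
```

In this toy setup the copying head scores far above the uniform head, so it comes out on top of the ranking. A real pipeline would average over many dataset sequences per head rather than a single hand-built example.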

    Here's the Head Finder in action:

    Head Finder for Qwen 3.5 0.8B

    Accessing & Sharing Attention Heads

    Since HeadVis is a core Neuronpedia integration, there are a few ways to access and share attention head dashboards:

    1. Model Page - There's now an "Attention Visualizer" panel on all model pages, which contains the whole HeadVis interface and finder. Model pages on Neuronpedia are in the URL format: neuronpedia.org/[modelId], like neuronpedia.org/qwen3.5-0.8b.
    2. Dropdowns - From any dashboard page or "jump to" panel on Neuronpedia, choose your model and select the "Attention Heads" release.
    3. Direct Links - Like feature dashboards, attention head dashboards are directly shareable by copying their URL, which is in the format neuronpedia.org/[modelId]/head/[layer]/[head_index], like neuronpedia.org/qwen3.5-0.8b/head/4/5. Also like feature dashboards, you can embed attention heads in an iframe with the embed=true query parameter.
    4. Exports - Attention head metrics and sequences based on the HeadVis specification are downloadable in our exports under [model]/headvis/[dataset_used]. For example, the HeadVis data for Gemma-3-27B-IT is available here. We used pile-uncopyrighted to generate all HeadVis data.
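The URL and path formats above can be sketched as small Python helpers. These are illustrative only and not part of any Neuronpedia client library: the function names, the https scheme, and the iframe width and height are assumptions, and the exact model identifier used in the exports may be spelled differently from the one in dashboard URLs.

```python
from html import escape
from urllib.parse import quote, urlencode

BASE_URL = "https://neuronpedia.org"  # assumption: https scheme


def head_url(model_id, layer, head_index, embed=False):
    """Build neuronpedia.org/[modelId]/head/[layer]/[head_index]."""
    url = f"{BASE_URL}/{quote(model_id, safe='')}/head/{int(layer)}/{int(head_index)}"
    if embed:
        # embed=true is the query parameter described in the post
        url += "?" + urlencode({"embed": "true"})
    return url


def head_iframe(model_id, layer, head_index, width=800, height=600):
    """Wrap the embed URL in an iframe tag (dimensions are arbitrary here)."""
    src = escape(head_url(model_id, layer, head_index, embed=True), quote=True)
    return f'<iframe src="{src}" width="{width}" height="{height}"></iframe>'


def headvis_export_prefix(model, dataset_used):
    """Relative export prefix in the form [model]/headvis/[dataset_used]."""
    return f"{model}/headvis/{dataset_used}"


link = head_url("qwen3.5-0.8b", 4, 5)
embed_link = head_url("qwen3.5-0.8b", 4, 5, embed=True)
iframe_tag = head_iframe("qwen3.5-0.8b", 4, 5)
```

Here `link` reproduces the example dashboard address from the list above, and `embed_link` is the same address with the embed parameter appended.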

    Natural Language Autoencoders - Community Contributions

    Explaining Features with Foreign NLAs (Francesco Zaffino)

    Read Post (LessWrong) | Notebook

    Contributor Francesco Zaffino previously demoed (notebook) using Gemma's NLA to explain SAE features. His new post extends this experiment in two ways:

    • Cross-model NLA AV Explanations: Using one model's NLA to explain a different model's SAE features by first mapping activations from one model to another.
    • Improving NLA SAE Explanations: By making the SAE vector look more like a residual stream activation (via "washout"), the explanations tend to be less influenced by random model quirks.

    Check out the post and the associated notebook.

    NLA for Gemma 4 E2B (Caleb DeLeeuw)

    Contributor Caleb DeLeeuw is working on NLAs for Gemma 4 E2B and has trained a few versions of them. Surprisingly, the NLAs were trained on a 4GB consumer GPU! These NLAs are still a work in progress, but there are two versions available for experimentation:

    • v0.0.1: Gemma 4 E2B AV and AR
    • v0.1: Gemma 4 E2B AV and AR

    Example code for running these NLAs is available here.


    New SAE Wave (David Chanin & Decode)

    Combined with the new SAEs from the last newsletter, here are the 14 new SAEs available on Neuronpedia, all with auto-interp explanations and available via our exports.

    Our thanks to Modal for generously providing the compute used to train the Qwen 3.5 0.8B and 4B SAEs.

    Model               Layer   Link
    Gemma 4 31B         30      30-res-matryoshka-131k
    Gemma 4 E2B         17      17-matryoshka-res-65k
    Gemma 4 E4B         21      21-matryoshka-res-65k
    Olmo 3 7B           16      16-res-matryoshka-65k
    Olmo 3 32B          32      32-res-batchtopk-131k
    Qwen 3 1.7B         14      14-resid-batchtopk-65k__l0-80
    Qwen 3 8B           18      18-resid-batchtopk-65k__l0-80
    Qwen 3 14B          20      20-resid-batchtopk-65k__l0-80
    Qwen 3 32B          32      32-resid-batchtopk-65k
    Qwen 3.5 0.8B       11      11-res-matryoshka-65k
    Qwen 3.5 2B Base    11      11-qwenscope-res-32k
    Qwen 3.5 4B         15      15-res-matryoshka-65k
    Qwen 3.5 9B Base    15      15-qwenscope-res-64k
    Qwen 3.5 27B        31      31-qwenscope-res-80k

    Poll: Which models do you (want to) use for research?

    ➡️ Poll Link

    Neuronpedia currently runs more than 20 models for live inference, steering, graph generation, circuit tracing, and now NLAs, both in the API and on our interface. Which of these models do you care about the most, and which models should we add? Please take a minute to answer the one-question poll so that we can prioritize the models that matter most to you.


    As always, please contact us with your questions, feedback, and suggestions.
