AxBench: Steering LLMs? Even Simple Baselines Outperform Sparse Autoencoders ｜ Neuronpedia

© Neuronpedia 2026


pyvene.ai, The Stanford NLP Group

·github.com ↗

axbench

Features in GEMMA-2-9B-IT@20-axbench-reft-r1-res-16k