AxBench: Steering LLMs? Even Simple Baselines Outperform Sparse Autoencoders
    pyvene.ai, The Stanford NLP Group
    axbench
