
    Neuronpedia

    © Neuronpedia 2025
    Dunefsky · Chlenski · Transcoders Enable Fine-Grained Interpretable Circuit Analysis
    GPT2-Small / Transcoders Residuals / 1-TRES-DC / 4066
    Explanations

    the word "wasn't" and its variations

    oai_token-act-pair · gpt-4o-mini · Triggered by @bot

    Negative Logits
     Pigs           -0.78
     Powered        -0.77
    birds           -0.71
    ONS             -0.71
     Dise           -0.67
    dragon          -0.66
     coefficients   -0.66
     dates          -0.65
    pell            -0.65
    planes          -0.65
    Positive Logits
    amacare         0.90
    uala            0.87
    ten             0.83
    �               0.82
    nel             0.73
    gan             0.72
    tyard           0.70
    rane            0.70
    ts              0.69
    eday            0.68
    Activations Density 0.018%

    No Known Activations