© Neuronpedia 2026
    Neuronpedia
    Andy Arditi · Finding Misaligned Persona Features in Open-Weight Models · Llama3.1-8B-IT · Resid Post - 131k · 27-RESID-POST-AA · Feature 33989
    Explanations

    Intense desire

    np_max-act-logits · gemini-2.0-flash

    expressions of strong desire or wanting something intensely, particularly when accompanied by intensifying adverbs like "really," "so," or "badly."

    oai_token-act-pair · claude-4-5-sonnet · Triggered by @grunklewordner

    This neuron activates for expressions of strong desire or urgent physical needs.

    oai_token-act-pair · gemini-2.5-flash · Triggered by @grunklewordner

    words related to movies.

    oai_token-act-pair · gemini-2.5-flash-lite · Triggered by @grunklewordner

    The neuron is detecting references to urination—tokens about needing to pee or the act of peeing.

    oai_token-act-pair · o4-mini · Triggered by @grunklewordner

    expressions of intense desire or urgency to do or get something, often emphasized with strong intensifiers like “so” or “really.”

    oai_token-act-pair · gpt-5 · Triggered by @grunklewordner

    expressions of intense desire or urgent need, particularly when combined with intensity adverbs like "so," "really," or "badly."

    oai_token-act-pair · claude-4-5-haiku · Triggered by @grunklewordner

    explicit pornographic sexual content.

    oai_token-act-pair · gpt-5-nano · Triggered by @grunklewordner

    Detects explicit sexual/erotic content and sexual acts or bodily-fluid fetish/taboo scenarios.

    oai_token-act-pair · gpt-5-mini · Triggered by @grunklewordner

    words expressing intense desire or need, especially in contexts of urgency or desperation.

    oai_token-act-pair · deepseek-r1 · Triggered by @grunklewordner

    phrases expressing strong desire or urgent need, particularly in contexts involving bodily functions (like needing to pee) or intense emotional/physical wants (like wanting a job, wanting to wear something, etc.). The neuron activates most strongly when there's a combination of urgency and a specific object of desire.

    oai_token-act-pair · deepseek-v3 · Triggered by @grunklewordner
    Configuration: andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_27/trainer_1
    Dataset (Dashboard): Various
    Features: 131,072
    Data Type: float32
    Hook Name: blocks.27.hook_resid_post
    Architecture: standard
    Context Size: 1,024
    Dataset: monology/pile-uncopyrighted
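
The configuration above names a "standard" autoencoder read from the residual stream after layer 27. As a rough illustration only, the sketch below shows how one feature's activation is commonly computed in a plain ReLU sparse autoencoder. It is an assumption that "standard" means this form here, and the function name, the toy weights, and the choice to subtract a decoder bias before encoding are all hypothetical rather than taken from the released checkpoint.

```python
def sae_feature_activation(resid, w_enc_col, b_enc, b_dec):
    """Activation of a single SAE feature for one residual-stream vector.

    Assumed form (not confirmed for this checkpoint):
        act = ReLU((resid - b_dec) . w_enc_col + b_enc)

    resid      -- residual-stream vector at the hooked position
    w_enc_col  -- encoder weights for this one feature
    b_enc      -- encoder bias for this one feature
    b_dec      -- decoder bias, subtracted before encoding
    """
    centered = [x - b for x, b in zip(resid, b_dec)]
    pre_activation = sum(c * w for c, w in zip(centered, w_enc_col)) + b_enc
    return max(0.0, pre_activation)


# Toy 3-dimensional example; the real residual stream is far wider.
weights = [0.5, 0.25, 1.0]
zero_bias = [0.0, 0.0, 0.0]
active = sae_feature_activation([2.0, 2.0, 0.0], weights, -0.5, zero_bias)
silent = sae_feature_activation([1.0, 2.0, -1.0], weights, -0.5, zero_bias)
```

In this toy case the first vector gives a positive pre-activation and the feature fires, while the second is clipped to zero by the ReLU.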

    Negative Logits
    " مح": -0.07
    "уча": -0.07
    "िवर": -0.07
    "=(-": -0.07
    "_race": -0.07
    "ोन": -0.07
    "Codigo": -0.07
    "UK": -0.07
    "Br": -0.07
    "وک": -0.07
    Positive Logits
    " establishments": 0.06
    ".rpm": 0.06
    " enjoys": 0.06
    " consolidation": 0.05
    ".latitude": 0.05
    " consolidated": 0.05
    " shootings": 0.05
    " cumbersome": 0.05
    " Louise": 0.05
    " especially": 0.05
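
Token lists like the negative and positive logits above are usually obtained by projecting a feature's decoder direction through the model's unembedding matrix and ranking the vocabulary by the resulting score. The sketch below illustrates that idea on toy data. Whether this dashboard computes its lists exactly this way is an assumption, and `logit_effects`, the toy vocabulary, and the toy matrices are hypothetical.

```python
def logit_effects(decoder_dir, unembed, vocab, k=2):
    """Rank vocabulary tokens by how much a feature direction shifts their logits.

    decoder_dir -- feature decoder vector, length d_model
    unembed     -- unembedding matrix as d_model rows of len(vocab) floats
    vocab       -- token strings, aligned with the columns of `unembed`
    Returns (k most negative, k most positive) as lists of (token, score).
    """
    scores = [
        sum(decoder_dir[i] * unembed[i][j] for i in range(len(decoder_dir)))
        for j in range(len(vocab))
    ]
    ranked = sorted(zip(vocab, scores), key=lambda pair: pair[1])
    most_negative = ranked[:k]
    most_positive = ranked[::-1][:k]
    return most_negative, most_positive


# Toy setup: d_model = 2 and a four-token vocabulary.
toy_vocab = ["tok0", "tok1", "tok2", "tok3"]
toy_unembed = [
    [0.2, -0.1, 0.0, 0.05],
    [0.1, 0.3, -0.2, 0.0],
]
negative, positive = logit_effects([1.0, -0.5], toy_unembed, toy_vocab)
```

The small magnitudes on this page (all within about 0.07 of zero) suggest that the feature has little direct effect on any particular output token.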
    Activations Density 0.088%
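
Assuming the density figure is the fraction of sampled tokens on which the feature is nonzero, 0.088% corresponds to roughly 88 firing tokens per 100,000. The helper below is a hypothetical sketch of that calculation, not the dashboard's own code.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions where the feature activation exceeds `threshold`."""
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Toy sample: 88 active positions out of 100,000.
sample = [1.0] * 88 + [0.0] * (100_000 - 88)
density = activation_density(sample)
density_text = f"{density:.3%}"
```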

    No Known Activations