© Neuronpedia 2026

    Neuronpedia

    Gemma-3-27B-IT · 3-GEMMASCOPE-2-TRANSCODER-262K · 124883
    Explanations

    * **TOP_POSITIVE_LOGITS**: `for`, `ről`, `ur`, `ués`, `u`, `ed`, `ië`, `iencia`, `kannya`, `naires`
        * These look like suffixes or word fragments from various languages, particularly Romance languages (e.g., `iencia`, `ués`, `ur`). The list hints at word endings or common word fragments.

    4. **TOP_ACTIVATING_TEXTS**:
        * "oto Kan** (June 2010 – September 2011) 8. **Yoshihiko Noda** (September 2011 – December 2012) 9."
            * "Yoshihiko Noda" is a former Prime Minister of Japan.
        * "**Romance:** Present, but not a central focus and very clean. * **Hunter x Hunter (Yoshihiro Togashi):** (36+ Volumes, ongoing, but with frequent hiatuses) *Vibe Check:"
            * "Yoshihiro Togashi" is the creator of Hunter x Hunter.
        * "river is known as the habitat of sparrows. Once in three years, the sparrows from distant mountain areas in Tan"
            * "Sumiyoshi river" - `river` appears.
        * "coffee liqueur (like Kahlua) or dark rum (optional, but recommended!) * **For the Mascarpone Cream:** * 6 large egg yolks * 3/4 cup (150g)"
            * "Cream" appears.

    **Pattern Analysis:**

    * **MAX_ACTIVATING_TOKENS**: `oshi`, `oshi`, `oshi`. This strongly points to the name component "Yoshi".
    * **TOKENS_AFTER_MAX_ACTIVATING_TOKEN**: `hiro`, `h`.
        * `oshi` + `hiro` -> `Yoshihiro` (Yoshihiro Togashi)
        * `oshi` + `h` -> `Yoshihiko` (Yoshihiko Noda)
    * The combination of `oshi` and the likely following name components (`hiro`, `h`) clearly indicates names starting with "Yoshi".
    * The `TOP_POSITIVE_LOGITS` `ur`, `ués`, `iencia`, `u`, `ed`, `ië`, `kannya`, `naires` and the presence of `river` and `Cream` in the `TOP_ACTIVATING_TEXTS` are less directly tied to the "Yoshi" pattern. However, `river` appears after `pone` in `TOKENS_AFTER_MAX_ACTIVATING_TOKEN`, and `Cream` appears after `pone` as well. The `TOP_POSITIVE_LOGITS` might be general word fragments or suffixes that this neuron activates in other contexts, but the `oshi` pattern is dominant and specific.
    * The `TOP_POSITIVE_LOGITS` also show fragments that could be relevant to names or titles in different languages or contexts (e.g., `ur` in names like Sura, `ués` as an ending in Spanish, like `íguez`).

    **Conclusion:**

    The most prominent and specific pattern is the recognition of Japanese names starting with "Yoshi", formed by combining `oshi` with following syllables like `hiro` or `h`. The other elements might represent secondary or less frequent activation patterns that are not as strongly represented in the provided snippets. The `TOP_POSITIVE_LOGITS` are quite diverse, and some look like common word endings or word parts in various languages, which could mean the neuron recognizes common linguistic structures in a broader sense, but the `oshi` pattern is the most striking.

    Given the rules:
    - Keep explanation concise (3-20 words).
    - Find patterns.
    - Specific.

    The specific pattern is names starting with "Yoshi".

    japanese names starting with yoshi

    np_acts-logits-general · gemini-2.5-flash-lite
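
    The token-joining step in the pattern analysis can be reproduced with a few lines of Python. This is only an illustrative sketch: the function names (`fragments`, `matches`, `starts_with_yoshi`) are hypothetical, and the inputs are just the tokens and names quoted in the explanation, not data pulled from the dashboard.

```python
# Tokens and names quoted in the explanation; nothing here is fetched from Neuronpedia.
MAX_ACTIVATING_TOKEN = "oshi"
TOKENS_AFTER = ["hiro", "h"]
NAMES_IN_TEXTS = ["Yoshihiko Noda", "Yoshihiro Togashi", "Sumiyoshi river"]


def fragments(max_token, next_tokens):
    """Join the max-activating token with each token that followed it."""
    return [max_token + nxt for nxt in next_tokens]


def matches(names, frags):
    """Map each name to the joined fragments it contains (hypothetical helper)."""
    return {name: [f for f in frags if f in name] for name in names}


def starts_with_yoshi(text):
    """True if the text begins with the name component 'Yoshi'."""
    return text.lower().startswith("yoshi")


if __name__ == "__main__":
    frags = fragments(MAX_ACTIVATING_TOKEN, TOKENS_AFTER)
    for name, found in matches(NAMES_IN_TEXTS, frags).items():
        print(name, found, starts_with_yoshi(name))
```

    Under these assumptions, the two personal names contain a joined fragment and start with "Yoshi", while "Sumiyoshi river" contains `oshi` only at the end of a word, which is consistent with the explanation treating it as a weaker match.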
    Configuration
    google/gemma-scope-2-27b-it/transcoder_all/layer_3_width_262k_l0_small_affine
    Prompts (Dashboard)
    238,145 prompts, 512 tokens each
    Dataset (Dashboard)
    lmsys + oasst1
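
    The configuration path above packs several settings into one string. The sketch below splits it into fields; the field names (`org`, `release`, `kind`, `layer`, `width_k`, `suffix`) and the reading of the last path segment are assumptions made for illustration, not a documented naming scheme.

```python
import re

# Configuration string copied from this page.
SOURCE = "google/gemma-scope-2-27b-it/transcoder_all/layer_3_width_262k_l0_small_affine"


def parse_source(path):
    """Split a source path into labelled parts (field names are hypothetical)."""
    org, release, kind, variant = path.split("/")
    # Assumed layout of the last segment: layer_<n>_width_<n>k_l0_<suffix>
    m = re.fullmatch(r"layer_(\d+)_width_(\d+)k_l0_(\w+)", variant)
    if m is None:
        raise ValueError(f"unrecognised variant segment: {variant!r}")
    return {
        "org": org,
        "release": release,
        "kind": kind,
        "layer": int(m.group(1)),
        "width_k": int(m.group(2)),
        "suffix": m.group(3),
    }


if __name__ == "__main__":
    print(parse_source(SOURCE))
```

    Read this way, the layer (3) and width (262k) agree with the source label `3-GEMMASCOPE-2-TRANSCODER-262K` shown at the top of the page.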

    Negative Logits
    ` CBSE`  0.71
    `ある`  0.66
    `说`  0.64
    `’`  0.64
    ` BJP`  0.63
    ` Cortes`  0.62
    ` Católica`  0.62
    ` Beginn`  0.61
    `かかる`  0.58
    `द्`  0.57
    Positive Logits
    ` for`  1.19
    `ről`  0.75
    `ur`  0.70
    `ués`  0.70
    `u`  0.70
    `ed`  0.69
    `ië`  0.66
    `iencia`  0.65
    `kannya`  0.64
    `naires`  0.64
    Activations Density 0.000%

    No Known Activations