© Neuronpedia 2026
    Gemma-3-27B-IT · 31-GEMMASCOPE-2-TRANSCODER-262K · Index 77553
    Explanations

Okay, I will analyze the provided lists to find a concise explanation for the neuron's behavior.

1. **MAX_ACTIVATING_TOKENS**: `of`, `facts`, `Facts`
2. **TOKENS_AFTER_MAX_ACTIVATING_TOKEN**: `<start_of_turn>`, `History`, `:**`, `3`
3. **TOP_POSITIVE_LOGITS**: `Facts`, `trivia`, `статистики`, `facts`, `statistiques`, `fakta`, `Trivia`, `Facts`, `Trivia`, `statistic`
4. **TOP_ACTIVATING_TEXTS**: Seem to be requests or responses containing "fun facts", "shock me with facts", "fun fact of the day", "cool fact", "random but interesting fact", "random fact", and also mentions of "Pizza facts", "Listverse", "Fun Facts", and "random trivia".

**Analysis**:

- `MAX_ACTIVATING_TOKENS` clearly points to "facts" and "Facts".
- `TOP_POSITIVE_LOGITS` reinforces this with "Facts", "facts", "Trivia", "statistic", and foreign language equivalents. This neuron is strongly associated with factual information.
- `TOP_ACTIVATING_TEXTS` show contexts like "tell me some fun facts", "shock me with facts", "fun fact of the day", "cool fact", "random fact", and headings like "**Pizza facts**", "**A Little Bit of History & Culture**", "**Fun Facts**". The pattern is about presenting or requesting factual information, trivia, or interesting tidbits.
- `TOKENS_AFTER_MAX_ACTIVATING_TOKEN` is a bit noisy. However, `History` appearing after `Facts` in "Facts History" could suggest a link to historical facts. The repetition of `<start_of_turn>` might indicate it triggers at the start of a response that delivers facts.

**Pattern Identification**:

The neuron appears to activate when the topic is about presenting or asking for facts, trivia, or interesting pieces of information. This is often signaled by the word "facts" or "trivia".

The `TOKENS_AFTER_MAX_ACTIVATING_TOKEN` combined with `TOP_ACTIVATING_TEXTS` suggests it might also be sensitive to the *delivery* of these facts, potentially in a structured or informative context, sometimes related to lists or history.

Given the strong emphasis on "facts" and "trivia" in both `MAX_ACTIVATING_TOKENS` and `TOP_POSITIVE_LOGITS`, and the contextual examples in `TOP_ACTIVATING_TEXTS`, the core theme is factual information. The mention of "History" and the structure of some examples (like lists of facts) add nuance.

Let's try to distill this into a concise phrase.

It's about facts, trivia, and interesting pieces of information.

"facts and trivia" is too simple, but captures the essence.

"presenting or requesting facts" is also good.

Considering the examples: "fun facts", "cool fact", "random fact", "Pizza facts", "Fun Facts", and "random trivia", the neuron is strongly tied to providing or seeking knowledge snippets.

A good concise phrase could be "facts and trivia".

However, `MAX_ACTIVATING_TOKENS` included "of", which is often part of phrases like "a lot of facts".

The `TOKENS_AFTER_MAX_ACTIVATING_TOKEN` lists `<start_of_turn> History :** 3 :**`. This suggests that after a potential trigger word like "facts", the neuron might be looking for specific content *following* it, perhaps indicating a context of presenting a fact or trivia.

Let's re-evaluate:

`MAX_ACTIVATING_TOKENS`: `of`, `facts`, `Facts`

`TOKENS_AFTER_MAX_ACTIVATING_TOKEN`: `<start_of_turn>`, `History`, `:**`, `3`

This combination, especially `Facts` followed by `History` or `3`, suggests it might be related to *types* of facts or *sources* of facts.

The `TOP_POSITIVE_LOGITS` contain `статистики` and `statistiques`, which means statistics.

The `TOP_ACTIVATING_TEXTS` contain "fun facts", "cool fact", "random fact", "Pizza facts", and mentions of "Listverse".

The neuron is definitely about factual content. The prompt asks for what it "detects or predicts by finding patterns in lists".

facts and trivia

    np_acts-logits-general · gemini-2.5-flash-lite
    Configuration
    google/gemma-scope-2-27b-it/transcoder_all/layer_31_width_262k_l0_small_affine
    Prompts (Dashboard): 238,145 prompts, 512 tokens each
    Dataset (Dashboard): lmsys + oasst1

    Negative Logits
    Token           Logit
    " latt"         0.35
    "{["            0.34
    "ૂત"            0.33
    "鞄"            0.33
    "{\"            0.33
    "ມື"            0.32
    "íqu"           0.32
    " chromatic"    0.31
    " {["           0.31
    " screened"     0.31
    Positive Logits
    Token              Logit
    " Facts"           0.47
    " trivia"          0.45
    " статистики"      0.45
    " facts"           0.44
    " statistiques"    0.44
    " fakta"           0.44
    "Trivia"           0.43
    "Facts"            0.42
    " Trivia"          0.41
    " statistic"       0.41
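
Several of the positive-logit tokens differ only in case or in a leading space (for example " Facts", " facts", and "Facts"). The sketch below groups such variants to make that clustering explicit. It is only an illustration: the helper name `group_by_surface_form` and the normalization rule (strip whitespace, then case-fold) are assumptions, not something the dashboard itself does.

```python
from collections import defaultdict

# (token, logit) pairs copied from the Positive Logits list above.
# A leading space is part of the token and is kept in the raw string.
POSITIVE_LOGITS = [
    (" Facts", 0.47),
    (" trivia", 0.45),
    (" статистики", 0.45),
    (" facts", 0.44),
    (" statistiques", 0.44),
    (" fakta", 0.44),
    ("Trivia", 0.43),
    ("Facts", 0.42),
    (" Trivia", 0.41),
    (" statistic", 0.41),
]


def group_by_surface_form(pairs):
    """Group tokens that differ only in case or surrounding whitespace.

    Hypothetical helper: the key is the token stripped and case-folded.
    """
    groups = defaultdict(list)
    for token, logit in pairs:
        groups[token.strip().casefold()].append((token, logit))
    return dict(groups)


groups = group_by_surface_form(POSITIVE_LOGITS)

# Print each group with its variants, strongest group first.
for key, members in sorted(
    groups.items(), key=lambda item: -max(logit for _, logit in item[1])
):
    variants = [token for token, _ in members]
    print(f"{key!r}: {len(members)} variant(s) {variants}")
```

Under this normalization, "facts" and "trivia" each collect three variants, while the statistics-related tokens stay separate because they are different words in different languages.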
    Activations Density 0.005%
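
For a rough sense of scale, the density can be combined with the dashboard prompt counts listed under Configuration. The sketch below assumes that the density is the fraction of dashboard tokens on which the feature is active and that every prompt fills all 512 tokens. The page states neither, so the result is an order-of-magnitude estimate only, and the function name is hypothetical.

```python
def estimate_active_tokens(prompts, tokens_per_prompt, density_percent):
    """Return (total_tokens, estimated_active_tokens).

    Assumption: density_percent is the share of all dashboard tokens
    on which the feature has a nonzero activation.
    """
    total_tokens = prompts * tokens_per_prompt
    estimated_active = total_tokens * density_percent / 100
    return total_tokens, estimated_active


# Values taken from this page: 238,145 prompts, 512 tokens each, density 0.005%.
total_tokens, estimated_active = estimate_active_tokens(238_145, 512, 0.005)
print(f"total tokens: {total_tokens:,}")
print(f"estimated active tokens: {estimated_active:,.0f}")
```

With these inputs the total is 121,930,240 tokens, so a 0.005% density would correspond to roughly six thousand active tokens under the stated assumptions.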

    No Known Activations