© Neuronpedia 2026

    Neuronpedia

    Gemma-3-12B · 24-GEMMASCOPE-2-RES-16K · Index 16012
    Explanations

    The neuron appears to be associated with listing or formatting, particularly after common words or numbers. Observing the `TOKENS_AFTER_MAX_ACTIVATING_TOKEN` list, I see items like bullets (`*`), numbers, formatting indicators (`E`, `t`), and structural words (`response`, `method`). The `TOP_ACTIVATING_TEXTS` also contains lists (bullet points, numbered lists) and structured content (WikiHow article steps, features of products). The `TOP_POSITIVE_LOGITS` being German words suggests the neuron might also be active in non-English contexts or when language switching occurs, but the primary visible pattern across all lists relates to enumeration and structured content.

    Given the instructions to be concise (3-20 words) and to find patterns in lists:

    - `MAX_ACTIVATING_TOKENS`: 'the', '1', '0'.
    - `TOKENS_AFTER_MAX_ACTIVATING_TOKEN`: '*', '0', '**', 'E', 't', 'response', 'method', '3', '*'. This strongly suggests list items, bullet points, or structural elements following other tokens.
    - `TOP_ACTIVATING_TEXTS`: Many examples show numbered lists, bulleted lists, or sequential steps/items.
    - `TOP_POSITIVE_LOGITS`: German words. This could imply the neuron activates for non-English text, or specific types of German constructs. However, the other lists point more strongly to a structural pattern.

    The pattern that seems most consistent across the activating texts and the tokens following the max activating tokens is list formatting or enumeration. The German part is a bit of a wildcard or secondary signal.

    Let's try to formulate a concise explanation:

    1. **list items or formatting**: This is a direct observation.
    2. **enumeration and bullet points**: More specific about the type of lists.
    3. **structured lists and formatting**: Broadens it slightly.
    4. **German language and list structure**: Combines both observed aspects.

    The prompt says "you should look at the lists and find a pattern that helps you explain the behavior of the neuron." And "The explanation could be about tokens following or preceding certain tokens." or "The explanation could be about words starting with a sequence."

    The top activating texts show many examples of lists. The `TOKENS_AFTER_MAX_ACTIVATING_TOKEN` shows bullet points and structural elements, often following the `MAX_ACTIVATING_TOKENS` which are just 'the' or numbers.

    The German words in `TOP_POSITIVE_LOGITS` are peculiar. If the neuron activates for German, it does so for words related to German. However, the task is to find a *pattern*. The pattern of list items seems stronger.

    Let's re-evaluate the `TOP_POSITIVE_LOGITS`. They are German words. 'durch', 'euch', 'auch', 'aus', 'nach' are common prepositions/adverbs. 'bege', 'spezi', 'eigenstates', 'leichter', 'akan' are harder to categorize immediately without context.

    Could the neuron be detecting lists *in German*? Or simply lists, and German is a secondary strong signal?

    Looking at the `TOP_ACTIVATING_TEXTS` again, the CATALAN text "Sol. 2. Venus: Un planeta calorós i dens conegut per albergar vida. 3. Terra: El nostre planeta natal, l'únic conegut per albergar vida. 4." is a numbered list. This supports the "list" pattern.

    What if the neuron detects list items, especially if they are preceded by common articles ('the') or numbers, and potentially in languages other than English, or specifically German items?

    Let's prioritize the most evident pattern. The list structure is seen in `TOP_ACTIVATING_TEXTS` and implied by `TOKENS_AFTER_MAX_ACTIVATING_TOKEN`.

    "list items and formatting" -> Too generic?
    "enumerated lists and bullet points" -> Better, but maybe too long.
    "bullet points and lists" -> Concise.

    bullet points and lists

    np_acts-logits-general · gemini-2.5-flash-lite

    The main thing this neuron does is detect numeric tokens (numbers and numerical expressions).

    oai_token-act-pair · o4-mini · Triggered by @jyhe0408
    Configuration
    google/gemma-scope-2-12b-pt/resid_post/layer_24_width_16k_l0_medium
    Prompts (Dashboard): 392,802 prompts, 256 tokens each
    Dataset (Dashboard): monology/pile-uncopyrighted

    Negative Logits
    'In'            1.12
    'People'        1.10
    'On'            1.08
    'I'             1.02
    'But'           1.00
    'It'            0.99
    ' हर्षवर्धन'      0.98
    'Since'         0.98
    'President'     0.96
    ' Они'          0.96

    Positive Logits
    ' bei'          0.93
    ' doen'         0.88
    ' nehmen'       0.87
    ' auch'         0.86
    ' identific'    0.85
    ' determinate'  0.85
    ' bege'         0.84
    ' neglig'       0.84
    ' diffe'        0.84
    ' jouw'         0.84
    Activations Density 0.096%
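
For a rough sense of scale, the density can be combined with the dashboard prompt counts listed under Configuration. The sketch below assumes that "Activations Density" means the fraction of dashboard tokens on which the feature has a nonzero activation; the page does not define the term, so treat the result as an order-of-magnitude estimate only. All variable names are illustrative.

```python
# Values copied from this page.
num_prompts = 392_802          # "392,802 prompts"
tokens_per_prompt = 256        # "256 tokens each"
density_percent = 0.096        # "Activations Density 0.096%"

# Total tokens seen by the dashboard run.
total_tokens = num_prompts * tokens_per_prompt

# ASSUMPTION: density is the fraction of tokens with a nonzero activation.
estimated_active_tokens = total_tokens * density_percent / 100

print(f"total tokens:            {total_tokens:,}")
print(f"estimated active tokens: {estimated_active_tokens:,.0f}")
```

Under that assumption the feature fires on roughly one token in a thousand, which is sparse enough that a handful of top-activating texts may not represent its full behavior.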

    No Known Activations