© Neuronpedia 2026
    Gemma-3-12B / 24-GEMMASCOPE-2-RES-16K / 14013
    Explanations

    - The presence of these tokens points towards punctuation and conjunctions, particularly in languages other than English, and general sentence-ending or separating markers.

      Considering the prompt's constraints and the data: `MAX_ACTIVATING_TOKENS` contains a mix of Chinese characters and "recover". `TOKENS_AFTER_MAX_ACTIVATING_TOKEN` are mostly punctuation.

      Let's look at the `TOP_ACTIVATING_TEXTS`:
      - "...处理日本。这更加激化了日方的不满。 **2. 经过** * **火车爆炸:** 1928年6月"
      - "...付出努力。 **建议:** * **认真思考自己的价值观和人生目标:** 婚姻是否符合你"
      - "...源氏需要预判她的走位,提前进行埋伏。 * **配合队友:** 与队友配合,例如由坦克掩护"
      - "...那次、那時、那個,都息息相關啊。 **小麗:** 没!!

      **Chinese punctuation and conjunctions**

    np_acts-logits-general · gemini-2.5-flash-lite

    The neuron strongly activates on multi‐syllabic abstract nouns that denote high‐level concepts, states, or evaluative notions.

    oai_token-act-pair · o4-mini · Triggered by @jyhe0408
    Configuration
    google/gemma-scope-2-12b-pt/resid_post/layer_24_width_16k_l0_medium
    Prompts (Dashboard): 392,802 prompts, 256 tokens each
    Dataset (Dashboard): monology/pile-uncopyrighted

    Negative Logits
     all              0.77
     edgy             0.68
     categorization   0.67
     marvel           0.65
     behaviour        0.62
     calorie          0.61
     Stream           0.60
     ovoj             0.60
     behavior         0.59
     categor          0.59
    Positive Logits
    н                 0.63
    ность             0.60
    Ю                 0.59
    stackpath         0.58
     হিন্দি             0.58
    ة                 0.57
    한편              0.56
    tif               0.56
    ЕТ                0.54
    いです            0.54
    Activations Density 0.008%
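
To put the density figure in perspective, the following is a rough back-of-the-envelope sketch rather than anything the dashboard itself reports. It assumes that "Activations Density" means the fraction of tokens on which the feature fires, and that it was measured over the dashboard prompt set listed under Configuration (392,802 prompts of 256 tokens each). Both are assumptions, and the function name `estimated_active_tokens` is hypothetical.

```python
# Figures copied from the dashboard; how they relate to each other is an assumption.
NUM_PROMPTS = 392_802        # "392,802 prompts"
TOKENS_PER_PROMPT = 256      # "256 tokens each"
DENSITY_PERCENT = 0.008      # "Activations Density 0.008%" (a rounded display value)


def estimated_active_tokens(num_prompts: int, tokens_per_prompt: int,
                            density_percent: float) -> tuple[int, int]:
    """Return (total tokens, estimated tokens with a nonzero activation).

    Assumes density is the share of all tokens on which the feature is active.
    """
    total = num_prompts * tokens_per_prompt
    active = round(total * density_percent / 100)
    return total, active


total_tokens, active_tokens = estimated_active_tokens(
    NUM_PROMPTS, TOKENS_PER_PROMPT, DENSITY_PERCENT
)
print(f"total tokens:            {total_tokens:,}")
print(f"estimated active tokens: {active_tokens:,}")
```

Under those assumptions the feature would fire on roughly eight thousand of about one hundred million tokens. Because the displayed density is rounded, the estimate is only an order-of-magnitude guide.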

    No Known Activations