Explanations

modifiers before words
np_acts-logits-general · gemini-2.5-flash-lite

The neuron activates on terms that label hierarchical subdivisions or categories (e.g. “sub-”, “major/minor”, “secondary”, “aggregate”, “categories”).
oai_token-act-pair · o4-mini Triggered by @jyhe0408

the word "sub" when it appears as a prefix or in hierarchical/categorical contexts (like "sub-categories", "sub-regional", "subtests").
oai_token-act-pair · claude-4-5-sonnet Triggered by @jyhe0408

mentions of hierarchical structure: subdivisions, levels (major/minor, upper/lower, primary/secondary), and aggregate groupings within a system.
oai_token-act-pair · gpt-5 Triggered by @jyhe0408

Configuration

google/gemma-scope-2-12b-pt/resid_post/layer_24_width_16k_l0_medium
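
The identifier above packs several settings into one path. As a minimal sketch, it can be split into named parts as shown below. The `SaeId` class and `parse_sae_id` helper are hypothetical names introduced here for illustration, and reading the last segment as layer index, dictionary width, and L0 sparsity tier is an assumption inferred from the name, not something the dashboard states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SaeId:
    """Hypothetical container for the parts of the identifier."""
    org: str
    release: str
    hook: str
    layer: int
    width: str
    l0: str


def parse_sae_id(path: str) -> SaeId:
    # Assumes exactly four "/"-separated segments: org/release/hook/variant.
    org, release, hook, variant = path.split("/")
    # Assumes the variant alternates key and value: layer_24_width_16k_l0_medium.
    parts = variant.split("_")
    fields = dict(zip(parts[0::2], parts[1::2]))
    return SaeId(
        org=org,
        release=release,
        hook=hook,
        layer=int(fields["layer"]),
        width=fields["width"],
        l0=fields["l0"],
    )


sae = parse_sae_id(
    "google/gemma-scope-2-12b-pt/resid_post/layer_24_width_16k_l0_medium"
)
print(sae)
```

Splitting on "/" first keeps the underscore inside `resid_post` from being confused with the underscores that separate the variant's keys and values.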

Prompts (Dashboard)

392,802 prompts, 256 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

" könnte"  0.71
"uppe"  0.68
"स्तान"  0.65
" away"  0.64
" konnte"  0.64
"ador"  0.63
"people"  0.63
"no"  0.61
" bisa"  0.61
"sı"  0.61

Positive Logits

" subdivided"  0.76
" වශයෙන්"  0.70
" spezif"  0.68
"分为"  0.65
"する場合は"  0.64
" specialized"  0.63
" Specify"  0.63
" subgroups"  0.62
" अर्थात्"  0.61
" これらの"  0.61

Activations Density 0.026%
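
For a sense of scale, the density can be combined with the prompt counts listed under Prompts (Dashboard). The sketch below assumes that the density is the fraction of tokens on which the feature is nonzero, and that it was measured over exactly those 392,802 prompts of 256 tokens each. Neither assumption is confirmed by the page, so the result is only a rough estimate.

```python
# Figures copied from the dashboard.
NUM_PROMPTS = 392_802
TOKENS_PER_PROMPT = 256
ACTIVATION_DENSITY_PERCENT = 0.026


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of token positions across all prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> int:
    """Estimated count of tokens on which the feature fires.

    Assumption: density is a per-token firing rate given in percent.
    """
    return round(total * density_percent / 100)


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, ACTIVATION_DENSITY_PERCENT)
print(f"total tokens: {total:,}")
print(f"estimated active tokens: {active:,}")
```

Under these assumptions the feature fires on roughly one token in every 3,800, which fits a feature tied to a narrow family of words.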

No Known Activations