Explanations

specific nouns and activities

np_acts-logits-general · gemini-2.5-flash-lite

This neuron spikes on individual tokens that are part of named entities or proper names (e.g. titles, character or product names, specialized jargon), effectively detecting proper nouns.

oai_token-act-pair · o4-mini Triggered by @yooniel31

This neuron detects section headings, titles, and other named entities—capitalized proper nouns and title-like phrases (e.g., game/book/character or list-item headings).

oai_token-act-pair · gpt-5-mini Triggered by @vetterc0

section headers and emphasized, title-style phrases, especially bolded list items and content-heavy proper-noun keywords.

oai_token-act-pair · gpt-5 Triggered by @yooniel31

Configuration

google/gemma-scope-2-27b-it/resid_post/layer_31_width_262k_l0_medium
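
The configuration string above appears to pack several fields into one path. As a minimal sketch, the snippet below splits it into labelled parts. The field names (`org`, `release`, `hook`, `layer`, `width`, `l0`) and the helper `parse_sae_id` are assumptions made for illustration, not a documented schema.

```python
import re


def parse_sae_id(sae_id: str) -> dict:
    """Split a path-style SAE identifier into labelled parts (hypothetical helper)."""
    # Assumed layout: <org>/<release>/<hook point>/<layer, width, L0 descriptor>
    org, release, hook, descriptor = sae_id.split("/")
    match = re.fullmatch(r"layer_(\d+)_width_(\w+?)_l0_(\w+)", descriptor)
    if match is None:
        raise ValueError(f"unrecognised descriptor: {descriptor!r}")
    return {
        "org": org,
        "release": release,
        "hook": hook,
        "layer": int(match.group(1)),
        "width": match.group(2),
        "l0": match.group(3),
    }


info = parse_sae_id(
    "google/gemma-scope-2-27b-it/resid_post/layer_31_width_262k_l0_medium"
)
print(info)
```

Read this way, the feature comes from the layer 31 residual stream (post), from a dictionary labelled 262k wide with a "medium" L0 setting.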

Prompts (Dashboard)

238,145 prompts, 512 tokens each

Dataset (Dashboard)

lmsys + oasst1

Negative Logits

 socalled

0.17

 فريبي

0.15

 groupBox

0.14

 atthe

0.14

 determinadas

0.14

 cannot

0.14

'='

0.14

 geheel

0.14

 yattha

0.14

 Elektrokhimiya

0.13

Positive Logits

ד

0.23

ra

0.21

ן

0.20

 গতকাল

0.19

ด

0.19

尔

0.19

ק

0.18

 நேற்று

0.18

ע

0.18

Activations Density 0.703%
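
To give the density figure a sense of scale, the arithmetic below combines it with the prompt counts listed under Prompts (Dashboard). This is a rough sketch under two assumptions that the page does not confirm: that density means the fraction of token positions with a nonzero activation, and that it was measured over those same 238,145 prompts of 512 tokens each.

```python
# Figures copied from the dashboard.
N_PROMPTS = 238_145
TOKENS_PER_PROMPT = 512
DENSITY = 0.703 / 100  # 0.703% expressed as a fraction

# Assumption: every prompt contributes exactly 512 token positions.
total_tokens = N_PROMPTS * TOKENS_PER_PROMPT

# Assumption: density = active token positions / total token positions.
est_active = round(total_tokens * DENSITY)

print(f"total token positions:      {total_tokens:,}")
print(f"estimated active positions: {est_active:,}")
```

Under those assumptions the feature would fire on roughly 0.86 million of about 122 million token positions, which fits a feature that responds to a fairly broad class of tokens such as proper nouns and headings.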
