
Explanations

Hag followed by noun/medical suffixes

np_acts-logits-general · gemini-2.5-flash-lite

The text string "Hag" followed by endings such as "rid", "ai", "ley", or "ans"; it often fires on character names or surnames.

oai_token-act-pair · claude-3-7-sonnet-20250219 Triggered by @neilrathi

The neuron responds to the character sequence “hag” wherever it appears in words or names.

oai_token-act-pair · o4-mini Triggered by @jyhe0408
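The explanations above differ in scope: one describes a capitalized "Hag" followed by specific endings, the other any occurrence of "hag". A minimal sketch of the two readings as plain string predicates may make the difference concrete. The function names and sample words are hypothetical, and these predicates only paraphrase the explanations; they do not reproduce the feature's actual activations.

```python
import re

# Endings quoted in the first explanation.
SUFFIXES = ("rid", "ai", "ley", "ans")

# Hypothetical predicate: capitalized "Hag" immediately followed by a listed ending.
NARROW = re.compile(r"Hag(?:%s)" % "|".join(SUFFIXES))


def matches_narrow(text: str) -> bool:
    """Narrow reading: "Hag" plus one of the listed endings."""
    return NARROW.search(text) is not None


def matches_broad(text: str) -> bool:
    """Broad reading: the sequence "hag" anywhere, ignoring case."""
    return "hag" in text.lower()


if __name__ == "__main__":
    # Illustrative sample words, not taken from the dashboard.
    for word in ["Hagrid", "Hagley", "Hagans", "Copenhagen", "shaggy", "Harold"]:
        print(word, matches_narrow(word), matches_broad(word))
```

Words such as "Copenhagen" or "shaggy" satisfy only the broad reading, so examples of that kind are the ones that would distinguish the two explanations.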


Configuration

google/gemma-scope-27b-pt-res/layer_10/width_131k

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted


Negative Logits

Token    Logit
笵    -2.84
臯    -2.69
(token not rendered)    -2.45
鉨    -2.34
熥    -2.28
那個    -2.25
(token not rendered)    -2.22
媄    -2.19
夕方    -2.16
𝅥    -2.16

Positive Logits

Token    Logit
(token not rendered)    3.02
“    2.36
鈈    2.33
栳    2.33
 themſelves    2.30
(token not rendered)    2.23
(token not rendered)    2.16
𝐒    2.16
 itſelf    2.16
 koning    2.16

Activations Density 0.005%
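To give the density figure a sense of scale, the arithmetic below combines it with the prompt count listed under Configuration. This is a rough sketch under two assumptions that the page does not confirm: that density means the fraction of token positions with a nonzero activation, and that it was measured over the same 24,576 prompts of 128 tokens. The displayed density is also rounded, so the result is an order-of-magnitude estimate only.

```python
# Figures copied from the dashboard.
num_prompts = 24_576
tokens_per_prompt = 128
density = 0.005 / 100  # 0.005% expressed as a fraction

# Total token positions across all prompts.
total_tokens = num_prompts * tokens_per_prompt

# Assumption: density is the fraction of token positions where the feature fires.
expected_active = total_tokens * density

print(f"total tokens: {total_tokens:,}")
print(f"estimated activating tokens: about {round(expected_active)}")
```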
