
Explanations

medical conditions, places, and proper nouns

np_acts-logits-general · gemini-2.5-flash-lite

non-English words, particularly from Asian and Middle Eastern languages.

oai_token-act-pair · claude-3-7-sonnet-20250219 · Triggered by @neilrathi

This neuron fires on uncommon subword fragments—especially those from rare or technical terms, non-English proper names, medical jargon, or code/hex sequences.

oai_token-act-pair · o4-mini · Triggered by @jyhe0408


Configuration

google/gemma-scope-27b-pt-res/layer_10/width_131k

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted


Negative Logits

Token                  Logit
"(>"                   -1.97
"of"                   -1.78
"any"                  -1.73
"be"                   -1.68
" даже"                -1.60
"$\"                   -1.56
"𓍊"                    -1.55
(token not rendered)   -1.53
" любви"               -1.52
" любой"               -1.51

Positive Logits

Token                  Logit
"';"                   1.90
" ſuch"                1.88
"erdings"              1.88
"汌"                   1.83
"泚"                   1.82
" 駅前"                1.77
" emplois"             1.75
"berra"                1.75
"取决"                 1.75
" againſt"             1.73
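
Logit tables of this kind are commonly produced by projecting a feature's decoder direction through the model's unembedding matrix and listing the most promoted and most suppressed tokens. The page does not say how its values were computed, so the sketch below is an assumption about the general technique, not a description of this dashboard's pipeline. The function name `top_logit_tokens` and the inputs `decoder_dir`, `unembed`, and `vocab` are hypothetical, and the toy numbers are illustrative only.

```python
import numpy as np

def top_logit_tokens(decoder_dir, unembed, vocab, k=10):
    """Return the k most promoted and k most suppressed tokens for one feature.

    decoder_dir: (d_model,) feature direction in the residual stream (hypothetical input)
    unembed:     (d_model, vocab_size) unembedding matrix (hypothetical input)
    vocab:       list of token strings of length vocab_size
    """
    # Direct effect of the feature direction on each token's logit.
    logits = decoder_dir @ unembed
    # Indices sorted from most negative to most positive logit.
    order = np.argsort(logits)
    negative = [(vocab[i], float(logits[i])) for i in order[:k]]
    positive = [(vocab[i], float(logits[i])) for i in order[::-1][:k]]
    return positive, negative

# Toy data, not taken from the dashboard: 2-dimensional model, 3-token vocabulary.
toy_dir = np.array([1.0, 0.0])
toy_unembed = np.array([[2.0, -1.0, 0.5],
                        [0.0,  0.0, 0.0]])
toy_vocab = ["a", "b", "c"]

positive, negative = top_logit_tokens(toy_dir, toy_unembed, toy_vocab, k=1)
print(positive, negative)
```

A real pipeline may apply additional steps (for example normalization before the unembedding), which this sketch omits.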

Activations Density 0.101%
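
As a rough sanity check, the density can be turned into an approximate token count. This assumes, without confirmation from the page, that the density is the fraction of tokens on which the feature is active and that it was measured over the dashboard prompts listed under Configuration (24,576 prompts of 128 tokens each).

```python
# Figures from the page; the interpretation of "density" is an assumption.
n_prompts = 24_576
tokens_per_prompt = 128
density = 0.101 / 100  # 0.101% expressed as a fraction

total_tokens = n_prompts * tokens_per_prompt
approx_active = round(total_tokens * density)

print(f"total tokens: {total_tokens:,}")
print(f"approx. active tokens: {approx_active:,}")
```

Under those assumptions the feature would be active on roughly 3,200 of about 3.1 million tokens, which is consistent with a sparse feature.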
