Explanations

the pronoun "he" and its related forms, sometimes near time-related words and other pronouns.

oai_token-act-pair · gemini-2.0-flash

Narrative

np_max-act-logits · gemini-2.0-flash

pronouns (method used: 1, because top tokens are subject pronouns)

np_max-act-logits · gpt-5-mini Triggered by @chenshw0109

Configuration

google/gemma-scope-2b-pt-transcoders/layer_24/width_16k/average_l0_37

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Features

16,384

Data Type

float32

Hook Name

blocks.24.ln2.hook_normalized

Architecture

jumprelu_transcoder

Context Size

1,024

Dataset

monology/pile-uncopyrighted

Negative Logits

" could"         -1.45
" Could"         -1.40
"could"          -1.39
"Could"          -1.38
" könnten"       -1.03
" pourraient"    -0.97
" COULD"         -0.93
" kunne"         -0.90
" podrían"       -0.89
"podr"           -0.88

Positive Logits

"may"            0.96
"May"            0.79
"May"            0.74
"may"            0.70
"MAY"            0.67
"MAY"            0.57
"idopsis"        0.54
"likon"          0.54
"DIPSETTING"     0.52
"kloped"         0.50

Activations Density 2.600%
