Explanations

expressions of desire or intention related to performing actions

oai_token-act-pair · gpt-4o-mini Triggered by @bot

Configuration

jbloom/Gemma-2b-Residual-Stream-SAEs/gemma_2b_blocks.10.hook_resid_post_16384

Prompts (Dashboard): 24,576 prompts, 128 tokens each
Dataset (Dashboard): chanind/openwebtext-gemma
Features: 16,384
Data Type: float32
Hook Name: blocks.10.hook_resid_post
Hook Layer:
Architecture: standard
Context Size: 1,024
Dataset: ctigges/openwebtext-gemma-1024-cl
Activation Function: relu
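As a rough illustration of what a "standard" architecture with a relu activation usually means for a sparse autoencoder, the sketch below encodes an input vector into non-negative feature activations and then reconstructs it. This is one common formulation, not necessarily the exact code used to train this SAE. The parameter names (`W_enc`, `b_enc`, `W_dec`, `b_dec`), the random weights, and the small sizes are all hypothetical; the real SAE has 16,384 features, and its input width is not listed in the configuration above.

```python
import numpy as np

def sae_forward(x, W_enc, b_enc, W_dec, b_dec):
    """One common formulation of a standard ReLU sparse autoencoder."""
    # Subtract the decoder bias, project into feature space, apply ReLU.
    feats = np.maximum((x - b_dec) @ W_enc + b_enc, 0.0)
    # Reconstruct the input as a linear combination of decoder rows.
    recon = feats @ W_dec + b_dec
    return feats, recon

rng = np.random.default_rng(0)
d_in, d_sae = 8, 32  # hypothetical sizes, far smaller than the real SAE

# Random float32 parameters stand in for trained weights (assumption).
W_enc = rng.standard_normal((d_in, d_sae)).astype(np.float32)
b_enc = np.zeros(d_sae, dtype=np.float32)
W_dec = rng.standard_normal((d_sae, d_in)).astype(np.float32)
b_dec = np.zeros(d_in, dtype=np.float32)

# A batch of 4 stand-in residual-stream vectors.
x = rng.standard_normal((4, d_in)).astype(np.float32)
feats, recon = sae_forward(x, W_enc, b_enc, W_dec, b_dec)
print(feats.shape, recon.shape)
```

Each column of `feats` corresponds to one feature; a page like this one describes a single such column.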

Negative Logits

"<bos>"         -2.27
"/***"          -0.83
(blank)         -0.75
"/**"           -0.75
"///**"         -0.69
" intersper"    -0.67
"ⓧ"             -0.65
"<?"            -0.65
"};*/"          -0.55
"beforeAll"     -0.55

Positive Logits

" venuto"       1.10
" signora"      0.91
" sorella"      0.82
" santiago"     0.80
" bambina"      0.79
" beverly"      0.76
" toledo"       0.74
"Grath"         0.74
" liberality"   0.74
" farfetch"     0.73

Activations Density 0.187%
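To put the density figure in context, the sketch below estimates how many token positions the feature would fire on. It assumes the density is the fraction of dashboard token positions with a nonzero activation, measured over the 24,576 prompts of 128 tokens listed in the configuration; the page itself does not state how the figure is computed. The function name is hypothetical, and because 0.187% is a rounded value the result is only approximate.

```python
def estimate_active_tokens(prompts: int, tokens_per_prompt: int,
                           density_percent: float) -> tuple[int, int]:
    """Return (total token positions, estimated positions where the feature fires)."""
    total_tokens = prompts * tokens_per_prompt
    # Density is given as a percentage, so convert it to a fraction first.
    active = round(total_tokens * density_percent / 100)
    return total_tokens, active

# Figures taken from the dashboard configuration and density line.
total_tokens, approx_active = estimate_active_tokens(24_576, 128, 0.187)
print(total_tokens, approx_active)
```

Under these assumptions the feature is active on a few thousand of roughly 3.1 million token positions.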
