Explanations

instances of the word "seen"

oai_token-act-pair · gpt-4o-mini Triggered by @bot

Configuration

jbloom/Gemma-2b-Residual-Stream-SAEs/gemma_2b_blocks.12.hook_resid_post_16384

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

HuggingFaceFW/fineweb

Features

16,384

Data Type

torch.float32

Hook Point

blocks.12.hook_resid_post

Architecture

standard

Context Size

1,024

Dataset

HuggingFaceFW/fineweb

Hook Point Layer

12

Activation Function

relu
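The configuration above describes a "standard" architecture with a ReLU activation over 16,384 features. A minimal sketch of what such a sparse autoencoder's forward pass typically looks like follows. The parameter names (`W_enc`, `b_enc`, `W_dec`, `b_dec`), the subtraction of the decoder bias before encoding, and the small demo sizes are assumptions for illustration, not details confirmed by this page; the weights are random placeholders rather than the trained SAE.

```python
import numpy as np

def make_sae(d_model, n_features, seed=0):
    """Build random placeholder SAE parameters (hypothetical, untrained)."""
    rng = np.random.default_rng(seed)
    return {
        "W_enc": rng.normal(scale=0.02, size=(d_model, n_features)).astype(np.float32),
        "b_enc": np.zeros(n_features, dtype=np.float32),
        "W_dec": rng.normal(scale=0.02, size=(n_features, d_model)).astype(np.float32),
        "b_dec": np.zeros(d_model, dtype=np.float32),
    }

def encode(sae, resid):
    """Map residual-stream vectors to non-negative feature activations."""
    # Assumption: the decoder bias is subtracted before encoding;
    # some implementations skip this step.
    pre_acts = (resid - sae["b_dec"]) @ sae["W_enc"] + sae["b_enc"]
    return np.maximum(pre_acts, 0.0)  # ReLU

def decode(sae, feature_acts):
    """Reconstruct residual-stream vectors from feature activations."""
    return feature_acts @ sae["W_dec"] + sae["b_dec"]

# Small demo sizes keep this cheap to run; the SAE on this page
# would use n_features=16384 at blocks.12.hook_resid_post.
sae = make_sae(d_model=64, n_features=512)
resid = np.random.default_rng(1).normal(size=(4, 64)).astype(np.float32)
feature_acts = encode(sae, resid)
recon = decode(sae, feature_acts)
print(feature_acts.shape, recon.shape)
```

Each column of `feature_acts` corresponds to one feature; the feature on this page would be a single such column.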

Negative Logits

Token (quoted to show leading spaces)    Logit
"<bos>"                                  -2.15
" Maryland"                              -0.52
"Enllaços"                               -0.51
"sol"                                    -0.50
"tons"                                   -0.50
" Tony"                                  -0.49
" Mary"                                  -0.49
"낼"                                     -0.48
" Nieder"                                -0.47
"Iné"                                    -0.46

Positive Logits

Token (quoted to show leading spaces)    Logit
" chrysler"                              1.08
" kapag"                                 1.07
" Seen"                                  1.04
"Seen"                                   1.04
" seen"                                  1.03
" nutella"                               1.03
" tucson"                                1.01
" oreo"                                  0.99
" errone"                                0.96
" mcdonald"                              0.96
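Logit lists like the ones above are commonly obtained by projecting a feature's decoder direction through the model's unembedding matrix and taking the most negative and most positive entries. The sketch below illustrates that idea under this assumption; the page itself does not state how its values were computed. The function name `top_logit_tokens`, the five-token vocabulary, and all numbers are made up for the example.

```python
import numpy as np

def top_logit_tokens(decoder_dir, W_U, vocab, k=3):
    """Return the k most negative and k most positive (token, logit) pairs."""
    logits = decoder_dir @ W_U      # one logit contribution per vocabulary token
    order = np.argsort(logits)      # ascending order of logit value
    negative = [(vocab[i], float(logits[i])) for i in order[:k]]
    positive = [(vocab[i], float(logits[i])) for i in order[::-1][:k]]
    return negative, positive

# Toy values only: a 2-dimensional "residual stream" and a 5-token vocabulary.
vocab = ["<bos>", " seen", "Seen", " Maryland", " the"]
W_U = np.array([
    [-2.0, 1.0, 1.2, -0.5, 0.0],
    [0.0, 0.5, 0.2, 0.1, 0.0],
])
decoder_dir = np.array([1.0, 0.0])

negative, positive = top_logit_tokens(decoder_dir, W_U, vocab, k=2)
print(negative)
print(positive)
```

With a real model, `W_U` would have one column per vocabulary token and `decoder_dir` would be the feature's row of the SAE decoder.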

Activations Density 0.104%
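To put the density figure in absolute terms, the arithmetic below converts it into an approximate token count. It assumes that the density is the fraction of dashboard tokens on which the feature is active and that it was measured over the 24,576 prompts of 128 tokens listed in the configuration; neither assumption is confirmed by the page.

```python
n_prompts = 24_576
tokens_per_prompt = 128
density = 0.104 / 100  # 0.104% expressed as a fraction

total_tokens = n_prompts * tokens_per_prompt
approx_active = density * total_tokens

print(f"total tokens: {total_tokens:,}")
print(f"approx. active tokens: {round(approx_active):,}")
```

Under these assumptions the feature fires on roughly 3,300 of about 3.1 million tokens. Because the density is displayed to three significant figures, the exact count is only known to within a few tens of tokens.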
