Neuronpedia: Gemma-2-9B · Attention Out - 16k · 0-GEMMASCOPE-ATT-16K · Feature 1207
Source set: Google DeepMind · Exploring Gemma 2 with Gemma Scope

    Explanations

    attends to specific terms or phrases when they are referenced or quoted from a later context, particularly focusing on a pattern involving desired tokens in a technical context

oai_attention-head · gpt-4o-mini · Triggered by @bot
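This feature's metadata, including the explanation above, can also be fetched programmatically. A minimal sketch, assuming Neuronpedia's documented /api/feature/{model}/{source}/{index} endpoint pattern and hedging on the exact response field names:

```python
import requests

# Identifiers taken from this page; the endpoint path is an assumption
# based on Neuronpedia's /api/feature/{model}/{source}/{index} pattern.
MODEL = "gemma-2-9b"
SOURCE = "0-gemmascope-att-16k"
INDEX = 1207

resp = requests.get(
    f"https://www.neuronpedia.org/api/feature/{MODEL}/{SOURCE}/{INDEX}",
    timeout=30,
)
resp.raise_for_status()
feature = resp.json()

# Field names below are assumptions, not confirmed by this page.
print(feature.get("explanations"))
print(feature.get("frac_nonzero"))  # activation density
```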
Configuration
google/gemma-scope-9b-pt-att/layer_0/width_16k/average_l0_61
Prompts (Dashboard): 16,384 prompts, 128 tokens each
Dataset (Dashboard): monology/pile-uncopyrighted
Features: 16,384
Data Type: float32
Hook Name: blocks.0.attn.hook_z
Hook Layer: 0
Architecture: jumprelu
Context Size: 1,024
Dataset: monology/pile-uncopyrighted
Activation Function: relu
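The configuration identifies a JumpReLU SAE trained on the layer-0 attention output (blocks.0.attn.hook_z) with 16,384 features. A minimal loading-and-encoding sketch, assuming the Gemma Scope release format in which each SAE ships as a params.npz file with W_enc, W_dec, b_enc, b_dec, and a per-feature threshold:

```python
import numpy as np
import torch
from huggingface_hub import hf_hub_download

# Assumed file layout: one params.npz per SAE inside the release repo.
path = hf_hub_download(
    repo_id="google/gemma-scope-9b-pt-att",
    filename="layer_0/width_16k/average_l0_61/params.npz",
)
params = {k: torch.from_numpy(v) for k, v in np.load(path).items()}

def encode(x: torch.Tensor) -> torch.Tensor:
    """JumpReLU encoder: ReLU pre-activations, zeroed below the threshold."""
    pre = x @ params["W_enc"] + params["b_enc"]
    return torch.relu(pre) * (pre > params["threshold"])

def decode(f: torch.Tensor) -> torch.Tensor:
    """Reconstruct the hook_z activation from feature activations."""
    return f @ params["W_dec"] + params["b_dec"]

# x is the flattened blocks.0.attn.hook_z activation (n_heads * d_head)
# for each token; this page's feature is then encode(x)[..., 1207].
```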
Head Attr Weights

Head:    0     1     2     3     4     5     6     7     8     9    10    11    12    13    14    15
Weight: 0.02  0.01  0.02  0.03  0.03  0.02  0.04  0.03  0.02  0.02  0.02  0.03  0.50  0.07  0.05  0.02
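The attribution is heavily concentrated on head 12 (0.50), with every other head below 0.07. The page does not state how these weights are computed; one common approach for a hook_z SAE is to split the feature's decoder vector by head and compare norms. A minimal sketch under that assumption:

```python
import torch

def head_attribution(W_dec: torch.Tensor, feature: int,
                     n_heads: int = 16) -> torch.Tensor:
    """Fraction of a hook_z feature's decoder norm carried by each head.

    W_dec: (d_sae, n_heads * d_head) decoder matrix. The hook_z input is
    the per-head attention outputs concatenated head by head, so the
    decoder row can be split back into per-head segments.
    """
    d_head = W_dec.shape[1] // n_heads
    per_head = W_dec[feature].reshape(n_heads, d_head)
    norms = per_head.norm(dim=-1)
    return norms / norms.sum()  # normalized so the weights sum to 1

# weights = head_attribution(params["W_dec"], feature=1207)
# On this feature, weights[12] would be ~0.50 per the dashboard.
```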
Negative Logits
Token      Logit
","        -1.36
" "        -1.13
" in"      -0.98
" and"     -0.93
"."        -0.93
" of"      -0.91
" ("       -0.87
" to"      -0.84
" for"     -0.84
"↵↵"       -0.83

Positive Logits
Token        Logit
<unused43>    1.98
<pad>         1.98
<unused41>    1.98
<unused74>    1.98
<unused51>    1.97
<unused42>    1.97
<unused23>    1.97
<unused14>    1.96
<unused3>     1.96
[@BOS@]       1.96
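These columns show which vocabulary tokens the feature most suppresses and promotes when its direction is projected onto the unembedding; the positive side here is dominated by <unused*> and <pad> tokens, which is common for layer-0 features. The page does not state the exact computation, but a standard logit-lens sketch for a hook_z feature (the decoder direction must first pass through the attention output projection W_O to reach the residual stream) looks like this:

```python
import torch

def feature_logits(w_dec_z: torch.Tensor, W_O: torch.Tensor,
                   W_U: torch.Tensor, k: int = 10):
    """Project a hook_z feature direction onto the vocabulary.

    w_dec_z: (n_heads * d_head,) decoder row for the feature
    W_O:     (n_heads * d_head, d_model) attention output projection,
             flattened in the same head order as hook_z
    W_U:     (d_model, d_vocab) unembedding matrix
    """
    resid_dir = w_dec_z @ W_O        # direction written to the residual stream
    logits = resid_dir @ W_U         # (d_vocab,)
    top = torch.topk(logits, k)      # most-promoted tokens
    bottom = torch.topk(-logits, k)  # most-suppressed tokens
    return top, bottom
```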
Activation Density: 0.371%

    No Known Activations
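Activation density is the fraction of sampled dataset tokens on which the feature fires (has a nonzero activation); 0.371% is roughly 1 in 270 tokens. A minimal sketch of how it can be estimated, reusing the hypothetical encode helper from the loading sketch above:

```python
import torch

@torch.no_grad()
def activation_density(z_acts: torch.Tensor, feature: int) -> float:
    """Fraction of tokens where the feature's activation is nonzero.

    z_acts: (n_tokens, n_heads * d_head) flattened hook_z activations
    collected over a sample of the dataset (e.g. pile-uncopyrighted).
    """
    f = encode(z_acts)[:, feature]  # encode() from the loading sketch above
    return (f > 0).float().mean().item()

# A returned value of ~0.00371 would match the 0.371% shown on this page.
```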