
Explanations

the word "the" (oai_token-act-pair · gemini-2.0-flash)

he/the (np_max-act · gemini-2.0-flash)


Configuration

google/gemma-scope-2b-pt-transcoders/layer_4/width_16k/average_l0_88

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Features: 16,384

Data Type: float32

Hook Name: blocks.4.ln2.hook_normalized

Architecture: jumprelu_transcoder

Context Size: 1,024

Dataset: monology/pile-uncopyrighted
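
The architecture listed above, jumprelu_transcoder, indicates that the hidden units use a JumpReLU activation: a pre-activation passes through unchanged when it exceeds a per-feature threshold and is set to zero otherwise. The following is a minimal sketch of that activation alone, in plain Python. The function name, the example inputs, and the threshold values are illustrative assumptions and are not taken from the released weights.

```python
from typing import List, Sequence


def jumprelu(pre_acts: Sequence[float], thresholds: Sequence[float]) -> List[float]:
    """Apply JumpReLU elementwise: keep z where z > threshold, else output 0.

    Illustrative helper (hypothetical name); one threshold per feature.
    """
    if len(pre_acts) != len(thresholds):
        raise ValueError("need exactly one threshold per feature")
    return [z if z > t else 0.0 for z, t in zip(pre_acts, thresholds)]


# Made-up pre-activations for three features, all with threshold 0.5.
example = jumprelu([0.2, 1.5, -0.3], [0.5, 0.5, 0.5])
print(example)
```

Unlike a plain ReLU, a value between zero and the threshold (0.2 in the example) is suppressed entirely rather than passed through.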


Negative Logits

 Harness  -0.48
 clef  -0.46
 Slay  -0.45
 Receipt  -0.45
czyna  -0.44
 alve  -0.43
Footnote  -0.41
 réguli  -0.41
 Beet  -0.41
tenis  -0.41

Positive Logits

<bos>  3.28
__':  0.90
__":  0.77
/**  0.70
']>  0.70
(blank)  0.68
})}  0.66
'}>  0.65
(blank)  0.64
')):  0.64

Activations Density 0.640%
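
As a rough reading of the density figure, assume it is the fraction of dashboard tokens on which the feature has a non-zero activation, and that it was measured over the 24,576 prompts of 128 tokens listed under Configuration. Both are assumptions; the page does not state how the density was computed. Under them, the implied number of active tokens can be estimated as follows (variable names are illustrative):

```python
# Values copied from the page; the interpretation of "density" is an assumption.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.640

# Total tokens covered by the dashboard prompts.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Estimated number of tokens with a non-zero activation for this feature.
estimated_active_tokens = total_tokens * DENSITY_PERCENT / 100

print(total_tokens)
print(round(estimated_active_tokens))
```

Because the displayed density is rounded to three decimal places, the result is only an approximate count.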


No Known Activations