Explanations

complex sequences of characters that likely represent symbols or special characters

oai_token-act-pair · gpt-3.5-turbo

references to box office earnings of popular films

oai_token-act-pair · gpt-4o-mini Triggered by @bot

Top Features by Cosine Similarity

Comparing With GPT2-SMALL @ 4-res-jb
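As a rough illustration of what a "top features by cosine similarity" ranking involves, the sketch below ranks rows of a toy decoder matrix against a query direction. It assumes each feature is represented by one decoder row; the function names, the toy matrix, and the choice of decoder rows are illustrative assumptions, not taken from the dashboard.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def top_features_by_cosine(query, decoder_rows, k=3):
    """Return the k (feature_index, similarity) pairs most similar to query."""
    scored = [
        (cosine_similarity(query, row), idx)
        for idx, row in enumerate(decoder_rows)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(idx, score) for score, idx in scored[:k]]


# Hypothetical 4-feature, 3-dimensional decoder (not real SAE weights).
toy_decoder = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [-1.0, 0.0, 0.0],
]

ranking = top_features_by_cosine(toy_decoder[0], toy_decoder, k=2)
```

In this toy case the query feature ranks itself first, followed by the nearly parallel second row; a real comparison would typically exclude the query feature from its own ranking.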

Configuration: jbloom/GPT2-Small-SAEs-Reformatted/blocks.4.hook_resid_pre

Prompts (Dashboard): 24,576 prompts, 128 tokens each
Dataset (Dashboard): Skylion007/openwebtext
Features: 24,576
Data Type: torch.float32
Hook Point: blocks.4.hook_resid_pre
Architecture: standard
Context Size: 128
Dataset: Skylion007/openwebtext
Hook Point Layer:
Activation Function: relu
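To make the "standard" architecture and "relu" activation entries concrete, here is a minimal sketch of a ReLU sparse autoencoder forward pass in plain Python. It assumes the common formulation in which the decoder bias is subtracted before encoding; whether this particular SAE does so is an assumption, and all names and toy weights are hypothetical. The real SAE would have 24,576 features and operate on the GPT-2 small residual stream at blocks.4.hook_resid_pre.

```python
def relu(x):
    """Rectified linear unit for a single float."""
    return x if x > 0.0 else 0.0


def sae_encode(resid, W_enc, b_enc, b_dec):
    """Map a residual vector to feature activations.

    W_enc has shape [d_model][n_features]. Subtracting b_dec first is an
    assumed convention, not confirmed for this SAE.
    """
    centered = [r - b for r, b in zip(resid, b_dec)]
    return [
        relu(sum(centered[i] * W_enc[i][j] for i in range(len(centered))) + b_enc[j])
        for j in range(len(b_enc))
    ]


def sae_decode(acts, W_dec, b_dec):
    """Reconstruct the residual vector. W_dec has shape [n_features][d_model]."""
    return [
        sum(acts[j] * W_dec[j][i] for j in range(len(acts))) + b_dec[i]
        for i in range(len(b_dec))
    ]


# Toy weights: d_model = 2, n_features = 3 (hypothetical values).
W_enc = [[1.0, 0.0, -1.0],
         [0.0, 1.0, -1.0]]
b_enc = [0.0, 0.0, 0.0]
W_dec = [[1.0, 0.0],
         [0.0, 1.0],
         [-1.0, -1.0]]
b_dec = [0.0, 0.0]

acts = sae_encode([2.0, -1.0], W_enc, b_enc, b_dec)
recon = sae_decode(acts, W_dec, b_dec)
```

Only the first toy feature fires on this input, so the reconstruction keeps the first coordinate and drops the negative second one.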

Negative Logits

".[     -0.27
).[     -0.24
Pwr     -0.22
.""     -0.22
)."     -0.22
]."     -0.22
."[     -0.22
}.      -0.21
''.     -0.20
)).     -0.20

Positive Logits

iaries      0.22
ovember     0.22
itialized   0.19
itzer       0.19
earch       0.19
viation     0.18
uff         0.18
eport       0.18
ruce        0.17
lash        0.17
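Lists of negative and positive logits like these are often obtained by projecting a feature's decoder direction through the model's unembedding matrix and sorting the resulting per-token scores. The sketch below shows that idea under stated assumptions: the exact procedure used by this dashboard (for example, any normalization or layer norm handling) is not given in the source, and the vocabulary, weights, and function names are hypothetical.

```python
def logit_effects(decoder_direction, W_U, vocab):
    """Score each vocabulary token by decoder_direction @ W_U.

    W_U has shape [d_model][vocab_size].
    """
    effects = []
    for tok_idx, token in enumerate(vocab):
        score = sum(
            decoder_direction[i] * W_U[i][tok_idx]
            for i in range(len(decoder_direction))
        )
        effects.append((token, score))
    return effects


def top_and_bottom(effects, k):
    """Return (k most negative, k most positive) token/score pairs."""
    ranked = sorted(effects, key=lambda pair: pair[1])
    return ranked[:k], ranked[::-1][:k]


# Toy example: d_model = 2, three placeholder tokens (not real GPT-2 data).
vocab = ["a", "b", "c"]
W_U = [[0.5, -0.5, 0.0],
       [0.0, 0.5, -0.25]]
direction = [1.0, -1.0]

negative, positive = top_and_bottom(logit_effects(direction, W_U, vocab), k=1)
```

Here token "b" receives the most negative score and token "a" the most positive, mirroring the two-sided layout of the lists above.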

Activations Density 10.509%
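One plausible reading of this figure is the fraction of token positions on which the feature has a nonzero activation. The sketch below computes that quantity for a toy set of prompts; the definition is an assumption, since the source does not state how the density is calculated, and the function name and toy activations are hypothetical. With 24,576 prompts of 128 tokens each, the dashboard corpus would contain 3,145,728 token positions.

```python
def activation_density(activations):
    """Fraction of token positions whose activation is strictly positive.

    activations is a list of prompts, each a list of per-token floats.
    """
    total = 0
    active = 0
    for prompt in activations:
        for value in prompt:
            total += 1
            if value > 0.0:
                active += 1
    return active / total if total else 0.0


# Toy activations: 2 prompts of 4 tokens each (hypothetical values).
toy_activations = [
    [0.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, 1.2, 0.0],
]
density = activation_density(toy_activations)

# Token positions in the dashboard corpus: prompts times tokens per prompt.
dashboard_tokens = 24_576 * 128
```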

No Known Activations