Explanations

phrases that instruct or suggest actions

oai_token-act-pair · gpt-3.5-turbo

instances of the word "to"

oai_token-act-pair · gpt-4o-mini Triggered by @bot

Top Features by Cosine Similarity

Comparing With GPT2-SMALL @ 0-res-jb

Configuration

jbloom/GPT2-Small-SAEs-Reformatted/blocks.0.hook_resid_pre

Prompts (Dashboard): 24,576 prompts, 128 tokens each
Dataset (Dashboard): Skylion007/openwebtext
Features: 24,576
Data Type: torch.float32
Hook Point: blocks.0.hook_resid_pre
Architecture: standard
Context Size: 128
Dataset: Skylion007/openwebtext
Hook Point Layer: 0
Activation Function: relu
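
For orientation, the configuration above (a "standard" architecture with a relu activation, reading from blocks.0.hook_resid_pre) corresponds roughly to the forward pass sketched below. This is a minimal illustration and not the code behind this page: the weight names (W_enc, b_enc, W_dec, b_dec), the subtraction of the decoder bias from the input, and the toy sizes are all assumptions. The real SAE would use 24,576 features and GPT-2 small's residual width of 768.

```python
import numpy as np


def sae_forward(x, W_enc, b_enc, W_dec, b_dec):
    """One common formulation of a standard ReLU sparse autoencoder.

    Returns (feature_acts, reconstruction). Subtracting b_dec from the
    input is an assumption; some configurations skip that step.
    """
    # Centre the input on the decoder bias, project into feature space, apply ReLU.
    pre_acts = (x - b_dec) @ W_enc + b_enc
    feature_acts = np.maximum(pre_acts, 0.0)
    # Reconstruct the residual-stream vector from the feature activations.
    x_hat = feature_acts @ W_dec + b_dec
    return feature_acts, x_hat


# Toy sizes so the sketch runs quickly (hypothetical, not the real dimensions).
rng = np.random.default_rng(0)
d_model, n_features, context_size = 16, 64, 128

# Random stand-in weights; a trained SAE would load these from a checkpoint.
W_enc = rng.standard_normal((d_model, n_features)).astype(np.float32) * 0.1
W_dec = rng.standard_normal((n_features, d_model)).astype(np.float32) * 0.1
b_enc = np.zeros(n_features, dtype=np.float32)
b_dec = np.zeros(d_model, dtype=np.float32)

# One prompt's worth of residual-stream vectors (random placeholder data).
x = rng.standard_normal((context_size, d_model)).astype(np.float32)
feature_acts, x_hat = sae_forward(x, W_enc, b_enc, W_dec, b_dec)
print(feature_acts.shape, x_hat.shape)
```

Each column of feature_acts plays the role of one feature such as the one shown on this page; the ReLU guarantees its activations are never negative.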


Negative Logits

lav: -0.69
eryl: -0.66
cephal: -0.66
ordon: -0.63
rell: -0.61
ifer: -0.59
bridge: -0.58
ery: -0.57
rys: -0.56
 encamp: -0.56

Positive Logits

TO: 3.01
TO: 1.80
 INTO: 1.64
FOR: 1.60
 FROM: 1.60
OF: 1.59
ON: 1.57
 ABOUT: 1.56
IN: 1.49
BY: 1.49
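
Lists like the negative and positive logits above are commonly produced by projecting a feature's decoder direction through the model's unembedding matrix and sorting the result. Whether this page was built exactly that way is an assumption. The sketch below shows the idea with a made-up five-token vocabulary and made-up numbers; top_logit_effects, the vocabulary, and the identity unembedding are all hypothetical.

```python
import numpy as np


def top_logit_effects(decoder_direction, W_U, vocab, k=10):
    """Return (most negative, most positive) token/effect pairs for one feature."""
    # Effect of the feature on each token's logit: direction times unembedding.
    effects = decoder_direction @ W_U  # shape: (vocab_size,)
    order = np.argsort(effects)        # ascending, most negative first
    negative = [(vocab[i], float(effects[i])) for i in order[:k]]
    positive = [(vocab[i], float(effects[i])) for i in order[::-1][:k]]
    return negative, positive


# Toy setup: 5-dimensional model with an identity unembedding (hypothetical).
vocab = ["tok_a", "tok_b", "tok_c", "tok_d", "tok_e"]
W_U = np.eye(5, dtype=np.float32)
decoder_direction = np.array([3.0, 1.6, 1.5, -0.7, -0.6], dtype=np.float32)

negative, positive = top_logit_effects(decoder_direction, W_U, vocab, k=2)
print("negative:", negative)
print("positive:", positive)
```

In a real model the vocabulary would come from the tokenizer, where a leading space is part of the token, which is why entries with and without a leading space are listed separately.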

Activations Density 0.024%
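
To put the density figure in perspective, the arithmetic below converts it into an approximate token count. It assumes the density is the fraction of tokens on which the feature has a nonzero activation, and that it was measured over the dashboard prompts listed in the configuration (24,576 prompts of 128 tokens each). Neither assumption is stated on the page, and activation_density is a hypothetical helper.

```python
def activation_density(n_active_tokens, n_prompts, tokens_per_prompt):
    """Fraction of all tokens on which the feature is active (assumed definition)."""
    total_tokens = n_prompts * tokens_per_prompt
    return n_active_tokens / total_tokens


n_prompts, tokens_per_prompt = 24_576, 128
total_tokens = n_prompts * tokens_per_prompt

# Token count implied by the reported 0.024% density. The page does not
# give this number; it is derived here and rounded to a whole token.
implied_active = round(0.024 / 100 * total_tokens)

print(total_tokens, implied_active)
print(f"{activation_density(implied_active, n_prompts, tokens_per_prompt):.5%}")
```

Under these assumptions the feature would fire on only a few hundred of roughly three million tokens.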
