INDEX

Explanations

mentions of the brand "Lego"

oai_token-act-pair · gpt-3.5-turbo

references to the Lego brand

oai_token-act-pair · gpt-4o-mini · Triggered by @bot

Top Features by Cosine Similarity

Comparing With GPT2-SMALL @ 0-res-jb
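The comparison ranks other features by cosine similarity to this one. The following is a minimal sketch of such a ranking, assuming each feature is represented by a direction vector (for example its decoder row). Which vectors the dashboard actually compares is not stated here, and the names `cosine_similarity` and `top_similar` are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_similar(
    target: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Return the k candidates most similar to target, best first."""
    scored = [(name, cosine_similarity(target, vec)) for name, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy directions, invented for illustration only.
toy_target = [1.0, 0.0]
toy_candidates = {
    "feature_a": [2.0, 0.0],   # same direction as the target
    "feature_b": [0.0, 1.0],   # orthogonal to the target
    "feature_c": [-1.0, 0.0],  # opposite direction
}
ranking = top_similar(toy_target, toy_candidates, k=2)
```

Cosine similarity ignores vector length, so `feature_a` ranks first even though it is twice as long as the target.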

Configuration

jbloom/GPT2-Small-SAEs-Reformatted/blocks.0.hook_resid_pre

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

Skylion007/openwebtext

Features

24,576

Data Type

torch.float32

Hook Point

blocks.0.hook_resid_pre

Architecture

standard

Context Size

128

Dataset

Skylion007/openwebtext

Hook Point Layer

0

Activation Function

relu
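As a rough illustration of what a "standard" architecture with a relu activation usually means for a sparse autoencoder, the sketch below computes one feature's activation from a residual-stream vector. It assumes the common form relu(w_enc · (x - b_dec) + b_enc). Whether this SAE subtracts the decoder bias before encoding is an assumption, and every name and number in the example is hypothetical.

```python
from typing import Sequence


def relu(x: float) -> float:
    """Rectified linear unit: negative inputs become zero."""
    return x if x > 0.0 else 0.0


def feature_activation(
    resid: Sequence[float],
    w_enc: Sequence[float],
    b_enc: float,
    b_dec: Sequence[float],
) -> float:
    """Activation of a single SAE feature for one residual-stream vector.

    Assumed form: relu(w_enc . (resid - b_dec) + b_enc).
    """
    centered = [r - b for r, b in zip(resid, b_dec)]
    pre_activation = sum(w * c for w, c in zip(w_enc, centered)) + b_enc
    return relu(pre_activation)


# Toy 3-dimensional example (the real hook point has the model's full width).
toy_w_enc = [0.5, 0.25, -1.0]
toy_b_dec = [0.0, 0.0, 0.5]
toy_b_enc = -0.1

inactive = feature_activation([1.0, -2.0, 0.5], toy_w_enc, toy_b_enc, toy_b_dec)
active = feature_activation([2.0, 0.0, 0.5], toy_w_enc, toy_b_enc, toy_b_dec)
```

In the first call the pre-activation is negative, so the relu clamps it to zero. This clamping is what makes most features silent on most tokens.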

Negative Logits

Token            Logit
" suppress"      -0.81
" suppressed"    -0.81
" Shir"          -0.81
" suppressing"   -0.78
" suppression"   -0.69
" sickness"      -0.67
" counseling"    -0.66
"tab"            -0.65
" disp"          -0.65
" stricken"      -0.65

Positive Logits

Token             Logit
" Lego"           4.02
" LEGO"           3.48
" Minecraft"      1.72
" Barbie"         1.53
"Minecraft"       1.49
" Brick"          1.43
" Transformers"   1.40
" Toys"           1.40
" Disneyland"     1.37
"Catalog"         1.33
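Logit lists like the ones above are commonly obtained by projecting a feature's decoder direction through the model's unembedding matrix and keeping the most positive and most negative tokens. The sketch below shows that idea under simplifying assumptions: it skips the final layer norm, the function names are hypothetical, and the token names and numbers are invented rather than taken from this feature.

```python
from typing import Dict, List, Sequence, Tuple


def logit_effects(
    decoder_dir: Sequence[float],
    unembed: Dict[str, Sequence[float]],
) -> Dict[str, float]:
    """Dot product of the feature's decoder direction with each token's unembedding vector."""
    return {
        token: sum(d * u for d, u in zip(decoder_dir, column))
        for token, column in unembed.items()
    }


def top_and_bottom(
    effects: Dict[str, float], k: int
) -> Tuple[List[Tuple[str, float]], List[Tuple[str, float]]]:
    """Return (k most positive, k most negative) tokens, each list strongest first."""
    ranked = sorted(effects.items(), key=lambda pair: pair[1], reverse=True)
    positive = ranked[:k]
    negative = ranked[-k:][::-1]  # reverse so the most negative token comes first
    return positive, negative


# Toy 2-dimensional vocabulary with made-up values.
toy_decoder_dir = [1.0, 0.0]
toy_unembed = {
    "tok_a": [4.0, 1.0],
    "tok_b": [1.5, -2.0],
    "tok_c": [-1.0, 3.0],
    "tok_d": [-0.5, 0.0],
}
effects = logit_effects(toy_decoder_dir, toy_unembed)
positive, negative = top_and_bottom(effects, k=2)
```

Here `positive` lists `tok_a` then `tok_b`, and `negative` lists `tok_c` then `tok_d`. This mirrors the layout of the two lists, where the strongest entry appears first.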

Activations Density 0.033%
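To give the density figure a sense of scale, the sketch below converts it to an approximate token count. It assumes the density is the fraction of tokens with a nonzero activation and that it was measured over the dashboard prompts listed in the configuration (24,576 prompts of 128 tokens). Both assumptions are illustrative, and the helper `activation_density` is hypothetical.

```python
from typing import Sequence

PROMPTS = 24_576            # from the dashboard configuration
TOKENS_PER_PROMPT = 128     # from the dashboard configuration
DENSITY_PERCENT = 0.033     # reported activation density

total_tokens = PROMPTS * TOKENS_PER_PROMPT
approx_active_tokens = total_tokens * DENSITY_PERCENT / 100


def activation_density(activations: Sequence[float]) -> float:
    """Fraction of tokens on which the feature fires (activation > 0)."""
    if not activations:
        return 0.0
    return sum(1 for a in activations if a > 0.0) / len(activations)


# Toy activation trace: the feature fires on one token out of four.
toy_density = activation_density([0.0, 0.0, 0.5, 0.0])

print(f"{total_tokens:,} tokens, roughly {approx_active_tokens:.0f} active")
```

Under these assumptions the feature fires on about a thousand of roughly 3.1 million tokens. That is in line with a feature tied to a single brand name.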