Explanations

references to or mentions of the word "Bunny"

oai_token-act-pair · gpt-3.5-turbo

references to "Bunny" and related terms, particularly in a whimsical or light-hearted context

oai_token-act-pair · gpt-4o-mini · Triggered by @bot

Top Features by Cosine Similarity

Comparing With GPT2-SMALL @ 0-res-jb

Configuration

jbloom/GPT2-Small-SAEs-Reformatted/blocks.0.hook_resid_pre

Prompts (Dashboard): 24,576 prompts, 128 tokens each
Dataset (Dashboard): Skylion007/openwebtext
Features: 24,576
Data Type: torch.float32
Hook Point: blocks.0.hook_resid_pre
Architecture: standard
Context Size: 128
Dataset: Skylion007/openwebtext
Hook Point Layer:
Activation Function: relu
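The configuration above describes a standard-architecture sparse autoencoder with a relu activation and 24,576 features. As a rough sketch of what such an autoencoder computes, the snippet below assumes the common form f = relu((x - b_dec) W_enc + b_enc) with a linear decoder, and assumes the GPT-2 small residual width of 768, which this page does not state. The weights are random placeholders, not the trained parameters, and the function name sae_forward is hypothetical.

```python
import numpy as np

D_MODEL = 768    # assumption: GPT-2 small residual width, not stated on this page
D_SAE = 24_576   # "Features" value from the configuration

def sae_forward(x, W_enc, b_enc, W_dec, b_dec):
    """Assumed standard SAE: ReLU encoder followed by a linear decoder."""
    feature_acts = np.maximum((x - b_dec) @ W_enc + b_enc, 0.0)  # relu
    reconstruction = feature_acts @ W_dec + b_dec
    return feature_acts, reconstruction

rng = np.random.default_rng(0)
# Random placeholder weights, NOT the trained SAE parameters.
W_enc = rng.standard_normal((D_MODEL, D_SAE), dtype=np.float32) * 0.02
W_dec = rng.standard_normal((D_SAE, D_MODEL), dtype=np.float32) * 0.02
b_enc = np.zeros(D_SAE, dtype=np.float32)
b_dec = np.zeros(D_MODEL, dtype=np.float32)

# One context of 128 token positions, matching "Context Size".
x = rng.standard_normal((128, D_MODEL), dtype=np.float32)
feature_acts, reconstruction = sae_forward(x, W_enc, b_enc, W_dec, b_dec)
print(feature_acts.shape, reconstruction.shape)
```

With trained weights, a single column of feature_acts would hold this feature's activation at each token position; with the random weights used here the values carry no meaning.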

Negative Logits

"umar"      -0.76
"ERA"       -0.72
"ESCO"      -0.69
" oxide"    -0.69
"ridor"     -0.67
"olith"     -0.66
" SATA"     -0.66
"itage"     -0.65
"eneg"      -0.65
"ignment"   -0.64

Positive Logits

" Bunny"      3.56
" bunny"      1.90
"liness"      1.40
" Wilde"      1.27
" imperson"   1.12
"vernment"    1.10
"nard"        1.05
"unny"        0.99
" droid"      0.95
"fox"         0.94
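The logit tables list the tokens whose output logits this feature raises or lowers the most. The page does not say how they were computed. One common approach, assumed here, is to project the feature's decoder direction through the model's unembedding matrix and sort the result. The sketch below uses a made-up 2-dimensional toy example: the matrix values, the five-token vocabulary, and the function name top_logit_effects are all hypothetical and do not reproduce the numbers above.

```python
import numpy as np

def top_logit_effects(decoder_dir, W_U, vocab, k=3):
    """Return the k most positive and k most negative (token, effect) pairs."""
    effects = decoder_dir @ W_U          # shape: (vocab_size,)
    order = np.argsort(effects)          # ascending order of effect
    negative = [(vocab[i], float(effects[i])) for i in order[:k]]
    positive = [(vocab[i], float(effects[i])) for i in order[::-1][:k]]
    return positive, negative

# Toy, made-up values for illustration only.
vocab = [" Bunny", " bunny", "fox", "umar", "ERA"]
W_U = np.array([[2.0, 1.0, 0.5, -0.5, -0.4],
                [0.0, 0.5, 0.0, -0.5, 0.0]])
decoder_dir = np.array([1.0, 1.0])

positive, negative = top_logit_effects(decoder_dir, W_U, vocab, k=2)
print(positive)
print(negative)
```

Leading spaces in token strings are kept because they are part of the token: " Bunny" and "unny" are different vocabulary entries.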

Activations Density: 0.057%
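To put the density in concrete terms, the arithmetic below combines it with the dashboard prompt counts given in the configuration. It assumes that the density is the fraction of dashboard token positions on which the feature is active, which the page does not state explicitly.

```python
n_prompts = 24_576          # from "Prompts (Dashboard)"
tokens_per_prompt = 128     # from "Prompts (Dashboard)"
density = 0.057 / 100       # "Activations Density" as a fraction

total_tokens = n_prompts * tokens_per_prompt
# Assumption: density = active token positions / total token positions.
expected_active = density * total_tokens

print(total_tokens)             # 3145728
print(round(expected_active))   # 1793
```

Under that assumption, the feature fires on roughly 1,800 of about 3.1 million token positions.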

No Known Activations