Explanations

names or terms with special characters, such as accents or non-English letters

oai_token-act-pair · gpt-3.5-turbo

references to prominent political figures or events

oai_token-act-pair · gpt-4o-mini · Triggered by @bot

Top Features by Cosine Similarity

Comparing With GPT2-SMALL @ 0-res-jb
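
As a rough illustration of what a ranking by cosine similarity computes, the sketch below scores a few feature vectors against a query vector and sorts them. The vectors and the names `feature_a`, `feature_b`, `feature_c`, and `top_features` are hypothetical; this page does not show the actual vectors being compared (presumably learned feature directions), so treat this as a sketch of the metric only, not of the dashboard's implementation.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def top_features(query, candidates, k=3):
    """Return the k candidate names with the highest cosine similarity."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical toy vectors, not taken from the dashboard.
query = [1.0, 0.0, 1.0]
candidates = {
    "feature_a": [2.0, 0.0, 2.0],    # same direction as the query
    "feature_b": [0.0, 1.0, 0.0],    # orthogonal to the query
    "feature_c": [-1.0, 0.0, -1.0],  # opposite direction
}
ranking = top_features(query, candidates, k=3)
```

Cosine similarity ignores vector length, so `feature_a` ranks first even though it is twice as long as the query.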

Configuration

jbloom/GPT2-Small-SAEs-Reformatted/blocks.0.hook_resid_pre

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

Skylion007/openwebtext

Features

24,576

Data Type

torch.float32

Hook Point

blocks.0.hook_resid_pre

Architecture

standard

Context Size

128

Dataset

Skylion007/openwebtext

Hook Point Layer

0

Activation Function

relu
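
For quick reference, the configuration values listed above can be gathered into a single record. The sketch below is only a convenience container: the class name `DashboardConfig` and its field names are hypothetical and do not correspond to any library's config schema. It also works out how many tokens the dashboard prompt set covers (prompts × tokens per prompt).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DashboardConfig:
    """Hypothetical container for the values shown on this page."""
    sae_path: str
    hook_point: str
    architecture: str
    activation_fn: str
    dtype: str
    n_features: int
    n_prompts: int
    context_size: int
    dataset: str

    @property
    def total_tokens(self) -> int:
        # Tokens covered by the dashboard prompts: prompts x tokens per prompt.
        return self.n_prompts * self.context_size


config = DashboardConfig(
    sae_path="jbloom/GPT2-Small-SAEs-Reformatted/blocks.0.hook_resid_pre",
    hook_point="blocks.0.hook_resid_pre",
    architecture="standard",
    activation_fn="relu",
    dtype="torch.float32",
    n_features=24_576,
    n_prompts=24_576,
    context_size=128,
    dataset="Skylion007/openwebtext",
)

print(f"{config.total_tokens:,} tokens in the dashboard prompt set")
```

With 24,576 prompts of 128 tokens each, the prompt set covers 3,145,728 tokens.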

Negative Logits

 Hunts

-0.64

 rear

-0.63

 messages

-0.61

 jobs

-0.60

market

-0.60

 Liberty

-0.58

 store

-0.58

Meg

-0.58

SEC

-0.58

 storage

-0.58

Positive Logits

ć (raw byte-level token: Äĩ)

5.35

č (raw byte-level token: Äį)

2.60

š (raw byte-level token: Å¡)

1.89

ş (raw byte-level token: ÅŁ)

1.75

ł (raw byte-level token: ÅĤ)

1.69

ğ (raw byte-level token: ÄŁ)

1.63

 Croatian

1.52

kson

1.50

tsky

1.43

ovic

1.42
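
Token strings such as Äĩ or Äį are not corrupted text. GPT-2's byte-level BPE vocabulary stores each raw byte as a printable stand-in character, so a two-byte UTF-8 letter appears as two unrelated-looking characters. The sketch below rebuilds that byte-to-character table in plain Python (following the scheme used by the GPT-2 tokenizer) and inverts it to recover the underlying letters. The helper name `decode_token` is an assumption made for this example.

```python
def bytes_to_unicode():
    """Map each byte 0-255 to the printable character GPT-2 uses for it."""
    # Bytes that are already printable keep their own code point.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to an unused code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token):
    """Turn a vocabulary-form token string back into readable text."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


for token in ["Äĩ", "Äį", "Å¡", "ÅŁ", "ÅĤ", "ÄŁ"]:
    print(repr(token), "->", decode_token(token))
```

The six tokens decode to ć, č, š, ş, ł, and ğ, which fits the other top tokens in the list ( Croatian, ovic, tsky). Note that `decode_token` expects the vocabulary form of a token, where a leading space is written as Ġ; a token displayed with a literal leading space would need that space converted first.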

Activations Density 0.015%
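
To give a sense of scale for this figure, the sketch below computes a density as the fraction of token positions with a nonzero activation, then estimates how many positions 0.015% would correspond to. It assumes the density is measured over the dashboard prompt set of 24,576 prompts × 128 tokens, which this page does not state explicitly; the function name `activation_density` and the toy activation values are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions where the feature activation exceeds threshold."""
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy example with made-up activations: 2 of 8 positions are active.
toy_activations = [0.0, 0.0, 1.2, 0.0, 0.0, 0.0, 0.4, 0.0]
toy_density = activation_density(toy_activations)

# Rough scale of the reported figure, ASSUMING it is measured over
# the dashboard prompt set (24,576 prompts x 128 tokens).
reported_density = 0.015 / 100
total_tokens = 24_576 * 128
expected_active = reported_density * total_tokens

print(f"toy density: {toy_density:.2%}")
print(f"about {expected_active:.0f} active positions out of {total_tokens:,}")
```

Under that assumption the feature would fire on roughly 472 of 3,145,728 token positions, which is consistent with a sparse feature.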