INDEX

Explanations

phrases indicating a shift in focus or topic

oai_token-act-pair · gpt-3.5-turbo

phrases that express contrast or separation in relation to other concepts

oai_token-act-pair · gpt-4o-mini Triggered by @bot

Top Features by Cosine Similarity

Comparing With GPT2-SMALL @ 5-res-jb
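
The page does not say how the cosine-similarity ranking is computed. A minimal sketch, assuming each feature is represented by a single direction vector (for example its decoder row) and that neighbours are ranked by plain cosine similarity. The function names and the toy vectors are hypothetical, not taken from the dashboard.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)

def top_similar_features(query, directions, k=3):
    """Return (index, similarity) pairs for the k directions closest to query."""
    scored = [(i, cosine_similarity(query, d)) for i, d in enumerate(directions)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy 3-dimensional directions (made up for illustration); real feature
# directions would have the width of the model's residual stream.
toy_directions = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [-1.0, 0.0, 0.0],
]
ranking = top_similar_features([1.0, 0.0, 0.0], toy_directions, k=2)
```

Here `ranking` holds the two closest toy directions, best match first. Cosine similarity ignores vector length, so features are compared by direction only.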

Configuration

jbloom/GPT2-Small-SAEs-Reformatted/blocks.5.hook_resid_pre

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

Skylion007/openwebtext

Features

24,576

Data Type

torch.float32

Hook Point

blocks.5.hook_resid_pre

Architecture

standard

Context Size

128

Dataset

Skylion007/openwebtext

Hook Point Layer

Activation Function

relu
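
The configuration above (standard architecture, relu activation, 24,576 features) can be illustrated with a small forward pass. This is a sketch under assumptions, not the released implementation: it assumes the common formulation in which the decoder bias is subtracted before encoding and added back after decoding, and it uses toy dimensions. The input width of 768 is the GPT2-small residual stream width; all function and variable names are hypothetical.

```python
# Sizes from the configuration above (D_IN is the GPT2-small residual width).
D_IN = 768
N_FEATURES = 24_576
EXPANSION_FACTOR = N_FEATURES // D_IN  # 32

def relu(values):
    return [max(0.0, v) for v in values]

def matvec(vec, matrix):
    """Row vector (n) times matrix (n x m) -> vector (m)."""
    n_cols = len(matrix[0])
    return [sum(vec[i] * matrix[i][j] for i in range(len(vec)))
            for j in range(n_cols)]

def sae_forward(x, w_enc, b_enc, w_dec, b_dec):
    """Assumed standard SAE: feats = relu((x - b_dec) W_enc + b_enc),
    recon = feats W_dec + b_dec."""
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    pre_acts = [p + b for p, b in zip(matvec(centered, w_enc), b_enc)]
    feats = relu(pre_acts)
    recon = [r + b for r, b in zip(matvec(feats, w_dec), b_dec)]
    return feats, recon

# Toy example: 2-dimensional input, 3 features (made-up weights).
w_enc = [[1.0, 0.0, -1.0],
         [0.0, 1.0, -1.0]]
b_enc = [0.0, 0.0, 0.0]
w_dec = [[1.0, 0.0],
         [0.0, 1.0],
         [-1.0, -1.0]]
b_dec = [0.0, 0.0]
feats, recon = sae_forward([2.0, -1.0], w_enc, b_enc, w_dec, b_dec)
```

In the toy run only the first feature is active, which mirrors the sparsity the relu is meant to produce. In practice the input would be the activation read at `blocks.5.hook_resid_pre`.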

Negative Logits

Token        Logit
"oping"      -0.63
"cat"        -0.63
"iser"       -0.62
"士"         -0.61
"pees"       -0.61
"eway"       -0.61
" tumble"    -0.60
"aimon"      -0.58
"ebra"       -0.57
"eries"      -0.57

Positive Logits

Token        Logit
"heid"       1.29
"isphere"    0.88
"comings"    0.87
"ments"      0.82
"icularly"   0.82
"Ħ¢"         0.75
"landish"    0.74
"lihood"     0.73
"ractor"     0.72
"ional"      0.72
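
The page does not state how the negative and positive logit lists are produced. One common approach, assumed here, is to project the feature's decoder direction through the model's unembedding matrix and read off the most suppressed and most promoted tokens. The sketch below uses a made-up 2-dimensional direction and a 4-token hypothetical vocabulary; none of its numbers come from the dashboard.

```python
def token_logit_effects(decoder_direction, unembed, vocab):
    """Dot the decoder direction with each token's unembedding column.

    unembed has shape (d_model x vocab_size), stored as a list of rows.
    """
    effects = []
    for token_id, token in enumerate(vocab):
        score = sum(d * unembed[i][token_id]
                    for i, d in enumerate(decoder_direction))
        effects.append((token, score))
    return effects

def top_and_bottom(effects, k):
    """Return (k most positive, k most negative), each strongest first."""
    ordered = sorted(effects, key=lambda pair: pair[1])
    positive = ordered[-k:][::-1]
    negative = ordered[:k]
    return positive, negative

# Hypothetical toy data for illustration only.
vocab = ["a", "b", "c", "d"]
decoder_direction = [1.0, -1.0]
unembed = [
    [0.5, 0.2, -0.1, -0.3],
    [-0.5, 0.1, 0.2, 0.4],
]
positive, negative = top_and_bottom(
    token_logit_effects(decoder_direction, unembed, vocab), k=2
)
```

With this toy data, `positive` lists "a" then "b" and `negative` lists "d" then "c". The scores describe the feature's direct effect on output logits, not how often those tokens actually follow an activation.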

Activations Density 0.018%
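
To give the density figure a sense of scale, the sketch below treats activation density as the fraction of token positions where the feature's activation is above zero. It assumes, without confirmation from the page, that the figure was measured over the 24,576 dashboard prompts of 128 tokens each. The helper name is hypothetical.

```python
def activation_density(activations):
    """Fraction of token positions where the feature fires (activation > 0)."""
    if not activations:
        return 0.0
    return sum(1 for a in activations if a > 0.0) / len(activations)

# Figures taken from the configuration and the reported density.
N_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
total_tokens = N_PROMPTS * TOKENS_PER_PROMPT   # 3,145,728 token positions
reported_density = 0.018 / 100                 # 0.018% as a fraction

# Assumption: density was computed over exactly these prompts.
approx_active_tokens = round(total_tokens * reported_density)

# Toy check of the helper on made-up activations.
toy_density = activation_density([0.0, 1.5, 0.0, 0.0])
```

Under that assumption the feature would fire on roughly 566 of about 3.1 million token positions, which is consistent with a very sparse feature.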

No Known Activations
