Explanations

family relationships or family-related terms, particularly focusing on parent-child relationships

oai_token-act-pair · gpt-3.5-turbo

references to individuals involved in incidents, particularly in a legal or distressing context

oai_token-act-pair · gpt-4o-mini Triggered by @bot

Top Features by Cosine Similarity

Comparing With GPT2-SMALL @ 11-res-jb

Configuration

jbloom/GPT2-Small-SAEs-Reformatted/blocks.11.hook_resid_pre

Prompts (Dashboard): 24,576 prompts, 128 tokens each
Dataset (Dashboard): Skylion007/openwebtext
Features: 24,576
Data Type: torch.float32
Hook Point: blocks.11.hook_resid_pre
Architecture: standard
Context Size: 128
Dataset: Skylion007/openwebtext
Hook Point Layer:
Activation Function: relu
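To make the configuration concrete, here is a minimal sketch of what a "standard" architecture with a relu activation usually means for a sparse autoencoder: an affine encoder followed by ReLU, and an affine decoder. This is an illustration under assumptions, not the code behind this dashboard. The function names, the toy weights, and the choice to subtract the decoder bias before encoding are all hypothetical. With the real configuration, the hidden width would be the 24,576 features listed above and the input would be the residual stream at blocks.11.hook_resid_pre (768-dimensional in GPT-2 small).

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def relu(v: Vector) -> Vector:
    """Elementwise ReLU, the activation function named in the configuration."""
    return [max(0.0, x) for x in v]


def matvec(v: Vector, m: Matrix) -> Vector:
    """Row vector times matrix: v (n) @ m (n x k) -> (k)."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]


def sae_encode(x: Vector, w_enc: Matrix, b_enc: Vector, b_dec: Vector) -> Vector:
    """Feature activations: ReLU((x - b_dec) @ W_enc + b_enc).

    Subtracting b_dec first is a common convention, assumed here.
    """
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    pre = [p + b for p, b in zip(matvec(centered, w_enc), b_enc)]
    return relu(pre)


def sae_decode(f: Vector, w_dec: Matrix, b_dec: Vector) -> Vector:
    """Reconstruction: f @ W_dec + b_dec."""
    return [r + b for r, b in zip(matvec(f, w_dec), b_dec)]


# Toy sizes (d_in = 2, d_sae = 3) with made-up weights, for illustration only.
x = [1.0, -2.0]
w_enc = [[1.0, 0.0, -1.0], [0.0, 1.0, 1.0]]
b_enc = [0.0, 0.0, 0.5]
b_dec = [0.0, 0.0]
w_dec = [[1.0, 0.0], [0.0, 1.0], [-0.5, 0.5]]

features = sae_encode(x, w_enc, b_enc, b_dec)
reconstruction = sae_decode(features, w_dec, b_dec)
print(features, reconstruction)
```

In this toy case only the first feature is active, which mirrors the sparsity such autoencoders are trained to produce.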

Negative Logits

Token       Logit
"izoph"     -0.72
"raltar"    -0.66
"peed"      -0.65
"ichick"    -0.64
"åĮ"        -0.62
"bes"       -0.62
"pell"      -0.61
"orting"    -0.60
"spons"     -0.60
"OME"       -0.60

Positive Logits

Token           Logit
"'s"            0.98
" fled"         0.89
" suffered"     0.88
" surn"         0.88
" died"         0.87
" belonged"     0.86
"was"           0.85
" testified"    0.85
" awoke"        0.82
" withdrew"     0.82
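The negative and positive logit lists above are commonly read as the tokens a feature most suppresses or promotes when it fires. One usual way to obtain such values, assumed here rather than confirmed by this page, is to take the dot product of the feature's decoder direction with each token's unembedding vector and rank the results. The sketch below uses hypothetical helper names and made-up two-dimensional vectors. The token strings are borrowed from the lists above, but the numbers are not the dashboard's.

```python
from typing import Dict, List, Tuple


def logit_effects(decoder_dir: List[float],
                  unembed: Dict[str, List[float]]) -> Dict[str, float]:
    """Dot product of the feature's decoder direction with each token's unembedding vector."""
    return {
        tok: sum(d * u for d, u in zip(decoder_dir, vec))
        for tok, vec in unembed.items()
    }


def top_and_bottom(effects: Dict[str, float],
                   k: int) -> Tuple[List[Tuple[str, float]], List[Tuple[str, float]]]:
    """Return the k most positive and the k most negative tokens."""
    ranked = sorted(effects.items(), key=lambda kv: kv[1], reverse=True)
    positive = ranked[:k]
    negative = ranked[-k:][::-1]  # most negative first
    return positive, negative


# Made-up toy vectors; real unembedding vectors are much higher-dimensional.
decoder_dir = [1.0, 0.5]
unembed = {
    " died": [0.75, 0.5],
    " fled": [0.5, 0.5],
    "izoph": [-0.75, 0.25],
    "OME": [-0.5, 0.0],
}

positive, negative = top_and_bottom(logit_effects(decoder_dir, unembed), k=2)
print(positive)
print(negative)
```

Quoting the tokens keeps any leading space visible, which matters because " fled" and "fled" are different tokens in a GPT-2 style vocabulary.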

Activations Density: 0.146%
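As a rough sense of scale, the density can be turned into an approximate token count. The sketch assumes the density is the fraction of tokens with a non-zero activation and that it was measured over the dashboard prompt set listed in the configuration (24,576 prompts of 128 tokens each). The page states neither, so the result is an estimate rather than a reported figure.

```python
# Values taken from the configuration and density shown on this page.
prompts = 24_576
tokens_per_prompt = 128
density = 0.146 / 100  # 0.146% expressed as a fraction

# Assumption: density = (tokens where the feature is active) / (all tokens).
total_tokens = prompts * tokens_per_prompt
expected_active = round(total_tokens * density)

print(total_tokens)     # 3145728
print(expected_active)  # 4593
```

Under these assumptions the feature would be active on roughly 4,600 of about 3.1 million tokens.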
