INDEX

Explanations

phrases indicating a lack of concern or care towards someone or something

oai_token-act-pair · gpt-3.5-turbo

discussions centered around indifference or lack of concern

oai_token-act-pair · gpt-4o-mini Triggered by @bot

Top Features by Cosine Similarity

Comparing With GPT2-SMALL @ 10-res-jb
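A ranking by cosine similarity can be sketched as follows. This is a minimal illustration, assuming each feature is represented by a direction vector (for example its decoder row) and that neighbours are ranked by the cosine of the angle between those vectors. The helper names and the toy vectors are hypothetical and are not taken from this page.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)

def top_similar_features(query_index, directions, k=3):
    """Rank all other features by cosine similarity to the query feature."""
    query = directions[query_index]
    scored = [
        (cosine_similarity(query, d), i)
        for i, d in enumerate(directions)
        if i != query_index
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [(i, round(s, 3)) for s, i in scored[:k]]

# Toy feature directions (hypothetical: 4 features in 3 dimensions).
toy_directions = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [-1.0, 0.0, 0.0],
]
ranking = top_similar_features(0, toy_directions, k=2)
```

Here `ranking` holds `(feature_index, similarity)` pairs for the two nearest neighbours of feature 0 in the toy set.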

Configuration

jbloom/GPT2-Small-SAEs-Reformatted/blocks.10.hook_resid_pre

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

Skylion007/openwebtext

Features

24,576

Data Type

torch.float32

Hook Point

blocks.10.hook_resid_pre

Architecture

standard

Context Size

128

Dataset

Skylion007/openwebtext

Hook Point Layer

10

Activation Function

relu
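The configuration above describes a sparse autoencoder with a ReLU activation and 24,576 features reading the layer-10 residual stream of GPT2-small (width 768). A minimal sketch of one forward pass is shown below, assuming the common formulation in which the decoder bias is subtracted from the input before encoding. That detail, the toy sizes, and the random weights are assumptions for illustration only and are not the weights of the SAE listed here.

```python
import random

def relu(values):
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in values]

def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def sae_forward(x, W_enc, b_enc, W_dec, b_dec):
    """Encode x into non-negative feature activations, then reconstruct it."""
    centered = [xi - bi for xi, bi in zip(x, b_dec)]  # assumed pre-encoder centering
    pre_acts = [p + b for p, b in zip(matvec(W_enc, centered), b_enc)]
    feature_acts = relu(pre_acts)
    recon = [r + b for r, b in zip(matvec(W_dec, feature_acts), b_dec)]
    return feature_acts, recon

# Toy sizes; the real SAE would use D_IN = 768 and N_FEATURES = 24576.
D_IN, N_FEATURES = 4, 8
rng = random.Random(0)
W_enc = [[rng.gauss(0, 1) for _ in range(D_IN)] for _ in range(N_FEATURES)]
W_dec = [[rng.gauss(0, 1) for _ in range(N_FEATURES)] for _ in range(D_IN)]
b_enc = [0.0] * N_FEATURES
b_dec = [0.0] * D_IN

x = [rng.gauss(0, 1) for _ in range(D_IN)]
feature_acts, reconstruction = sae_forward(x, W_enc, b_enc, W_dec, b_dec)
```

Each entry of `feature_acts` corresponds to one feature; a feature is considered active on a token when its entry is greater than zero.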

Negative Logits

Token               Logit
"arm"               -0.84
"Beta"              -0.80
"NAS"               -0.80
"cue"               -0.80
"DragonMagazine"    -0.78
"Cent"              -0.76
"Consider"          -0.75
"igmatic"           -0.74
"Sus"               -0.74
"auri"              -0.74

Positive Logits

Token               Logit
" specifics"        1.06
" aesthetics"       1.05
" whether"          1.03
" politics"         1.02
" anything"         0.99
" winning"          0.98
" preserving"       0.98
" money"            0.97
" semantics"        0.95
" getting"          0.95
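Lists like the negative and positive logits above are commonly obtained by projecting a feature's output direction through the model's unembedding matrix and keeping the most extreme tokens. The sketch below illustrates that idea under this assumption. The page does not state how its values were computed, and the vocabulary, matrix, and direction here are invented toy values.

```python
def logit_effects(feature_direction, unembed, vocab):
    """Score every token by its dot product with the feature direction.

    unembed holds one row per vocabulary token, each of length d_model.
    Returns (token, score) pairs sorted from most negative to most positive.
    """
    scores = [
        sum(d * w for d, w in zip(feature_direction, row))
        for row in unembed
    ]
    return sorted(zip(vocab, scores), key=lambda pair: pair[1])

def top_and_bottom(effects, k):
    """Split sorted effects into the k most negative and k most positive."""
    negative = effects[:k]
    positive = effects[-k:][::-1]  # largest score first
    return negative, positive

# Hypothetical toy data: 4 tokens, d_model = 2.
vocab = ["A", "B", "C", "D"]
unembed = [
    [1.0, 0.2],
    [0.9, -0.1],
    [-0.8, 0.3],
    [-0.7, 0.0],
]
feature_direction = [1.0, 0.0]

effects = logit_effects(feature_direction, unembed, vocab)
negative, positive = top_and_bottom(effects, k=2)
```

In this toy case `negative` starts with token "C" and `positive` starts with token "A", mirroring the two lists shown on the page.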

Activations Density 0.198%
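As a rough reading of this figure, the sketch below treats density as the fraction of token positions on which the feature's activation is greater than zero. It then estimates the corresponding token count, assuming the density was measured over the dashboard prompt set listed above (24,576 prompts of 128 tokens each). Both the definition and that choice of denominator are assumptions, since the page does not specify them.

```python
def activation_density(activations):
    """Fraction of token positions where the feature activation is > 0."""
    total = len(activations)
    if total == 0:
        return 0.0
    return sum(1 for a in activations if a > 0) / total

# Figures taken from this page.
prompts = 24_576
tokens_per_prompt = 128
density_percent = 0.198

total_tokens = prompts * tokens_per_prompt  # 3,145,728 token positions
approx_active_tokens = round(total_tokens * density_percent / 100)

# Toy check of the definition: 1 active position out of 4.
toy_density = activation_density([0.0, 0.0, 1.5, 0.0])
```

Under these assumptions the feature would be active on roughly 6,229 of the 3,145,728 token positions.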

No Known Activations