INDEX

Explanations

names of characters or actors playing characters in films

oai_token-act-pair · gpt-3.5-turbo

references to notable film characters and their roles

oai_token-act-pair · gpt-4o-mini · Triggered by @bot

Top Features by Cosine Similarity

Comparing With GPT2-SMALL @ 10-res-jb

Configuration

jbloom/GPT2-Small-SAEs-Reformatted/blocks.10.hook_resid_pre

Prompts (Dashboard): 24,576 prompts, 128 tokens each
Dataset (Dashboard): Skylion007/openwebtext
Features: 24,576
Data Type: torch.float32
Hook Point: blocks.10.hook_resid_pre
Architecture: standard
Context Size: 128
Dataset: Skylion007/openwebtext
Hook Point Layer:
Activation Function: relu
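The configuration above can be captured in a small record for scripting. The following is a minimal sketch, not the dashboard's own code: the class name `SaeDashboardConfig` and its field names are hypothetical, and only the values are taken from the listing. The Hook Point Layer value is not shown on the page, so it is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SaeDashboardConfig:
    """Hypothetical container; field names are illustrative, values come from the page."""
    sae_path: str = "jbloom/GPT2-Small-SAEs-Reformatted/blocks.10.hook_resid_pre"
    hook_point: str = "blocks.10.hook_resid_pre"
    n_features: int = 24_576
    n_prompts: int = 24_576
    context_size: int = 128
    dataset: str = "Skylion007/openwebtext"
    dtype: str = "torch.float32"
    architecture: str = "standard"
    activation_fn: str = "relu"

    def dashboard_tokens(self) -> int:
        # Total tokens behind the dashboard: prompts times tokens per prompt.
        return self.n_prompts * self.context_size


cfg = SaeDashboardConfig()
print(cfg.dashboard_tokens())  # 24,576 * 128 = 3,145,728
```

Keeping the record frozen makes it safe to pass around as a fixed description of one dashboard run.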

Negative Logits

Token          Logit
"istries"      -1.07
"AMS"          -0.93
"ournals"      -0.89
"views"        -0.89
"ICES"         -0.87
"atories"      -0.87
"orders"       -0.85
"ISO"          -0.84
"Apps"         -0.84
"tests"        -0.84

Positive Logits

Token             Logit
" protagonist"    1.29
" narrator"       1.26
" bearded"        1.26
" villain"        1.26
"guy"             1.24
" heroine"        1.17
" handsome"       1.17
" blonde"         1.17
" blond"          1.13
" thief"          1.13
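Lists like the negative and positive logits above are commonly produced by projecting a feature's decoder direction onto the model's unembedding vectors and ranking the results. The sketch below illustrates that ranking step under this assumption; the page itself does not state how its values were computed. The function names, the 2-dimensional vectors, and the four-token vocabulary are all toy values invented for illustration, not data from the SAE.

```python
def logit_effects(decoder_dir, unembed):
    """Dot the feature's decoder direction with each token's unembedding vector."""
    return {
        token: sum(d * u for d, u in zip(decoder_dir, vec))
        for token, vec in unembed.items()
    }


def top_and_bottom(effects, k):
    """Return the k most positive and the k most negative tokens."""
    ranked = sorted(effects.items(), key=lambda kv: kv[1], reverse=True)
    most_positive = ranked[:k]
    most_negative = ranked[-k:][::-1]  # most negative first
    return most_positive, most_negative


# Toy inputs (hypothetical): real vectors would have the model's hidden size.
decoder_dir = [1.0, -0.5]
unembed = {
    " villain": [1.0, -0.5],
    " thief": [1.0, 0.0],
    "views": [-0.5, 0.5],
    "AMS": [-1.0, 0.0],
}

effects = logit_effects(decoder_dir, unembed)
top, bottom = top_and_bottom(effects, k=2)
print(top)     # [(' villain', 1.25), (' thief', 1.0)]
print(bottom)  # [('AMS', -1.0), ('views', -0.75)]
```

Tokens are kept as exact strings, including any leading space, because in GPT-2's vocabulary " villain" and "villain" are different tokens.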

Activations Density 0.326%
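To put the density figure in concrete terms, the arithmetic below estimates how many dashboard tokens the feature fires on. It is a rough sketch that assumes the density is the fraction of the 24,576 × 128 dashboard tokens with a nonzero activation; the page does not define the figure, and the function name is hypothetical.

```python
def expected_active_tokens(density_percent, n_prompts, tokens_per_prompt):
    """Estimate the number of tokens with nonzero activation from a density in percent."""
    total_tokens = n_prompts * tokens_per_prompt
    return round(total_tokens * density_percent / 100)


estimate = expected_active_tokens(0.326, 24_576, 128)
print(estimate)  # 10255
```

Under that assumption the feature is active on roughly 10,255 of 3,145,728 tokens, about one token in every 307.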
