Explanations

words related to character names and interactions in a narrative context

oai_token-act-pair · gpt-3.5-turbo

Configuration

neuronpedia/gpt2-small__res_scl-ajt/6-res_scl-ajt

Prompts (Dashboard)

12,288 prompts, 128 tokens each

Dataset (Dashboard)

Skylion007/openwebtext

Features

46,080

Data Type

torch.float32

Hook Point

blocks.6.hook_resid_pre

Architecture

standard

Context Size

128

Dataset

apollo-research/Skylion007-openwebtext-tokenizer-gpt2

Hook Point Layer

6

Activation Function

relu
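The configuration above describes a standard-architecture sparse autoencoder with a ReLU activation, reading the residual stream at blocks.6.hook_resid_pre. A minimal sketch of that forward pass follows. It assumes the residual width of GPT-2 small (768) and uses hypothetical random weights with a reduced dictionary size (n_demo) so it runs quickly. Whether the decoder bias is subtracted from the input before encoding depends on the training setup and is left out here. The function names are illustrative, not part of any library.

```python
import numpy as np

D_MODEL = 768        # residual stream width of GPT-2 small
N_FEATURES = 46_080  # dictionary size listed in the configuration
CONTEXT = 128        # context size listed in the configuration


def sae_encode(resid, w_enc, b_enc):
    """Feature activations: ReLU(resid @ w_enc + b_enc)."""
    return np.maximum(resid @ w_enc + b_enc, 0.0)


def sae_decode(acts, w_dec, b_dec):
    """Reconstruct the residual stream from feature activations."""
    return acts @ w_dec + b_dec


# Hypothetical random weights; a reduced dictionary keeps the demo light.
# Replace n_demo with N_FEATURES to match the real dictionary size.
n_demo = 512
rng = np.random.default_rng(0)
w_enc = rng.standard_normal((D_MODEL, n_demo), dtype=np.float32) * 0.02
w_dec = rng.standard_normal((n_demo, D_MODEL), dtype=np.float32) * 0.02
b_enc = np.zeros(n_demo, dtype=np.float32)
b_dec = np.zeros(D_MODEL, dtype=np.float32)

# One prompt's worth of stand-in residual activations: (tokens, d_model).
resid = rng.standard_normal((CONTEXT, D_MODEL), dtype=np.float32)
acts = sae_encode(resid, w_enc, b_enc)
recon = sae_decode(acts, w_dec, b_dec)
print(acts.shape, recon.shape, acts.dtype)
```

Each row of acts holds one token's feature activations; the feature on this page would correspond to a single column of that matrix.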

• Layer 6 UMAP region: Mostly-local cluster on left - local

Negative Logits

Token               Logit
"DOWN"              -0.77
"hement"            -0.73
"edin"              -0.73
"SPONSORED"         -0.71
"INAL"              -0.70
" largeDownload"    -0.70
"reement"           -0.69
"andowski"          -0.68
"Down"              -0.68
"owship"            -0.68

Positive Logits

Token               Logit
"laws"              0.88
" unnoticed"        0.77
" virtue"           0.76
" proxy"            0.75
"products"          0.74
"product"           0.74
" stealth"          0.70
" Proxy"            0.68
"gone"              0.67
" leaps"            0.67
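Token lists like the negative and positive logits above are commonly produced by projecting a feature's decoder direction through the model's unembedding matrix and keeping the most extreme entries. The sketch below illustrates that idea under stated assumptions: the dashboard's exact procedure (for example, any normalization) is not given on this page, and the weights, vocabulary, and function name here are hypothetical toy values.

```python
import numpy as np


def top_logit_effects(feature_dir, w_u, vocab, k=10):
    """Return the k most boosted and k most suppressed tokens.

    feature_dir: (d_model,) decoder direction of one feature
    w_u:         (d_model, n_vocab) unembedding matrix
    vocab:       list of n_vocab token strings
    """
    effects = feature_dir @ w_u          # one logit effect per token
    order = np.argsort(effects)          # ascending
    negative = [(vocab[i], float(effects[i])) for i in order[:k]]
    positive = [(vocab[i], float(effects[i])) for i in order[::-1][:k]]
    return positive, negative


# Toy stand-ins for the real decoder direction and unembedding matrix.
rng = np.random.default_rng(0)
d_model, n_vocab = 8, 20
vocab = [f"tok{i}" for i in range(n_vocab)]
feature_dir = rng.standard_normal(d_model)
w_u = rng.standard_normal((d_model, n_vocab))

positive, negative = top_logit_effects(feature_dir, w_u, vocab, k=3)
for token, value in positive:
    print(f"{token!r:10} {value:+.2f}")
```

Printing tokens with repr keeps leading spaces visible, which matters for GPT-2 tokens such as " proxy" and "products".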

Activations Density 9.181%
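To put the density figure in context, here is a small worked calculation. It assumes the density is the fraction of tokens on which the feature has a nonzero activation, and that it was measured over the dashboard prompts listed in the configuration (12,288 prompts of 128 tokens each). Neither assumption is confirmed by the page, and the helper name is illustrative.

```python
def activation_density(activations):
    """Fraction of tokens with a strictly positive activation."""
    return sum(1 for a in activations if a > 0) / len(activations)


n_prompts = 12_288
tokens_per_prompt = 128
density = 0.09181  # 9.181% as reported

total_tokens = n_prompts * tokens_per_prompt
approx_active = round(total_tokens * density)

print(f"total tokens:          {total_tokens:,}")
print(f"approx. active tokens: {approx_active:,}")

# Toy check of the helper: 1 of 4 tokens is active.
print(activation_density([0.0, 0.0, 1.3, 0.0]))
```

Under those assumptions the feature would fire on roughly 144,000 of about 1.57 million tokens, which is dense for a sparse autoencoder feature.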

No Known Activations