Explanations

statements about experiences, behaviors, and actions

oai_token-act-pair · gpt-3.5-turbo

Configuration

neuronpedia/gpt2-small__res_scefr-ajt/6-res_scefr-ajt

Prompts (Dashboard): 12,288 prompts, 128 tokens each
Dataset (Dashboard): Skylion007/openwebtext
Features: 46,080
Data Type: torch.float32
Hook Point: blocks.6.hook_resid_pre
Architecture: standard
Context Size: 128
Dataset: apollo-research/Skylion007-openwebtext-tokenizer-gpt2
Hook Point Layer:
Activation Function: relu
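
The configuration above describes a standard ReLU sparse autoencoder reading the GPT-2 small residual stream. As a minimal sketch of how one feature's activation could be computed under that setup: the function name, the random weights, and the choice to subtract the decoder bias before encoding are all assumptions for illustration, not details taken from this page. Only the feature count comes from the configuration; 768 is the residual width of GPT-2 small.

```python
import random

D_MODEL = 768        # residual-stream width of GPT-2 small
N_FEATURES = 46_080  # feature count from the configuration


def feature_activation(resid, w_enc, b_enc, b_dec):
    """ReLU((resid - b_dec) . w_enc + b_enc) for a single feature.

    Subtracting b_dec before encoding is an assumption; some SAEs skip it.
    """
    pre = sum((x - c) * w for x, c, w in zip(resid, b_dec, w_enc)) + b_enc
    return max(0.0, pre)


# Hypothetical random weights stand in for the trained parameters.
rng = random.Random(0)
resid = [rng.gauss(0.0, 1.0) for _ in range(D_MODEL)]
w_enc = [rng.gauss(0.0, 0.05) for _ in range(D_MODEL)]
b_dec = [0.0] * D_MODEL

act = feature_activation(resid, w_enc, b_enc=-0.1, b_dec=b_dec)
expansion = N_FEATURES // D_MODEL  # dictionary size relative to d_model
print(f"activation={act:.4f}, expansion factor={expansion}")
```

With 46,080 features over a 768-dimensional residual stream, the dictionary is 60 times wider than the stream it reads from.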

Negative Logits

"hips"          -0.83
"itatively"     -0.76
"Priv"          -0.73
"ielding"       -0.71
"ãĤ½"           -0.70
"busters"       -0.70
" è£ıè¦ļéĨĴ"    -0.69
" Eighth"       -0.67
" Institution"  -0.67
" Polk"         -0.66

Positive Logits

"chy"        1.26
"unes"       1.14
"iner"       1.10
"ain"        1.10
"asca"       1.02
" wasn"      1.02
"self"       1.01
" seems"     1.00
" happened"  0.99
"beh"        0.96
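
Logit lists like the two above are commonly obtained by projecting a feature's decoder direction through the model's unembedding matrix and ranking the vocabulary by the result. The sketch below assumes that method; this page does not state how its values were computed. The toy vocabulary, vectors, and function names are hypothetical.

```python
def logit_effects(decoder_dir, unembed_cols, vocab):
    """Rank tokens by the dot product of the decoder direction with each
    token's unembedding column, highest first."""
    scores = [
        sum(d * u for d, u in zip(decoder_dir, col)) for col in unembed_cols
    ]
    return sorted(zip(vocab, scores), key=lambda pair: pair[1], reverse=True)


def top_and_bottom(ranked, k):
    """Return the k most positive and k most negative (token, score) pairs."""
    return ranked[:k], ranked[-k:][::-1]


# Toy 2-dimensional example with a hypothetical 3-token vocabulary.
vocab = ["a", "b", "c"]
unembed_cols = [[2.0, 0.0], [-1.0, 5.0], [0.5, 1.0]]
decoder_dir = [1.0, 0.0]

ranked = logit_effects(decoder_dir, unembed_cols, vocab)
positive, negative = top_and_bottom(ranked, k=1)
print(positive, negative)
```

In a real setting the vocabulary would be GPT-2's token list, which is why entries appear as raw vocabulary strings, including leading spaces and byte-level characters.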

Activations Density: 1.743%
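
To give the density figure a sense of scale, the arithmetic below converts it into an approximate token count. It assumes the density is the fraction of dashboard tokens on which the feature is active, and that it was measured over the 12,288 prompts of 128 tokens listed in the configuration; neither assumption is confirmed by this page.

```python
N_PROMPTS = 12_288        # from "Prompts (Dashboard)"
TOKENS_PER_PROMPT = 128   # from "Prompts (Dashboard)"
DENSITY_PERCENT = 1.743   # from "Activations Density"

# Total tokens in the dashboard prompt set.
total_tokens = N_PROMPTS * TOKENS_PER_PROMPT

# Approximate number of tokens with a nonzero activation (assumed definition).
active_tokens = round(total_tokens * DENSITY_PERCENT / 100)

print(total_tokens, active_tokens)
```

Under those assumptions the feature would be active on roughly 27,415 of 1,572,864 tokens.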

No Known Activations