Explanations

unexpected or surprising events and situations

oai_token-act-pair · gpt-3.5-turbo

Configuration

neuronpedia/gpt2-small__res_scefr-ajt/6-res_scefr-ajt

Prompts (Dashboard)

12,288 prompts, 128 tokens each

Dataset (Dashboard)

Skylion007/openwebtext

Features

46,080

Data Type

torch.float32

Hook Point

blocks.6.hook_resid_pre

Architecture

standard

Context Size

128

Dataset

apollo-research/Skylion007-openwebtext-tokenizer-gpt2

Hook Point Layer

Activation Function

relu
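
The configuration above describes a sparse autoencoder with a ReLU activation reading from `blocks.6.hook_resid_pre`. As a rough illustration only, the sketch below assumes that "standard" means the common encoder form relu(W_enc (x - b_dec) + b_enc); the page does not confirm this. The function name `encode`, the toy weights, and the 3-dimensional input are hypothetical and far smaller than the real 46,080-feature dictionary.

```python
def encode(x, W_enc, b_enc, b_dec):
    """Toy ReLU SAE encoder (assumed form): relu(W_enc @ (x - b_dec) + b_enc)."""
    # Subtract the decoder bias from the residual-stream vector (assumption).
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    # One pre-activation per feature: dot(row, centered) + bias.
    pre_acts = [
        sum(w * c for w, c in zip(row, centered)) + b
        for row, b in zip(W_enc, b_enc)
    ]
    # ReLU keeps only positive pre-activations, which is what makes codes sparse.
    return [max(0.0, p) for p in pre_acts]


# Hypothetical toy values: 3-dimensional input, 2 features.
x = [1.0, 2.0, 3.0]
b_dec = [0.0, 0.0, 0.0]
W_enc = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0]]
b_enc = [0.5, 0.0]

acts = encode(x, W_enc, b_enc, b_dec)
print(acts)  # [1.5, 0.0]
```

In this toy run the second feature has a negative pre-activation and is clamped to zero, so only the first feature counts as active for this input.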

Negative Logits

Token            Logit
" throats"       -0.84
" favored"       -0.80
" hemor"         -0.76
" throat"        -0.75
"ailability"     -0.75
" approved"      -0.74
"approved"       -0.74
"ona"            -0.73
"illes"          -0.72
"stood"          -0.70

Positive Logits

Token            Logit
" Sharif"        0.92
"Gaw"            0.90
" juxtap"        0.89
" parallels"     0.86
" irony"         0.79
"how"            0.79
" EDITION"       0.78
" Sturgeon"      0.78
" Vaugh"         0.77
" Manitoba"      0.77
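
The page does not say how the negative and positive logit lists are produced. A common approach, assumed here, is to project the feature's decoder direction through the model's unembedding matrix and rank tokens by the resulting score. The sketch below uses a hypothetical 2-dimensional decoder direction and a made-up three-token vocabulary; none of the numbers come from the dashboard.

```python
def logit_effects(decoder_dir, W_U, vocab):
    """Rank tokens by dot(decoder_dir, unembedding vector), highest first."""
    # W_U holds one unembedding vector per vocabulary token (toy layout).
    scores = [sum(d * w for d, w in zip(decoder_dir, col)) for col in W_U]
    return sorted(zip(vocab, scores), key=lambda pair: pair[1], reverse=True)


# Hypothetical toy values, not taken from the dashboard.
decoder_dir = [1.0, -1.0]
vocab = ["a", "b", "c"]
W_U = [[0.5, 0.0], [0.0, 0.5], [0.25, 0.25]]

ranked = logit_effects(decoder_dir, W_U, vocab)
top_positive = ranked[0]   # token the feature promotes most
top_negative = ranked[-1]  # token the feature suppresses most
print(top_positive, top_negative)
```

Under that assumption, the head of the ranking corresponds to the positive logits list and the tail to the negative logits list.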

Activations Density 1.762%
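
To give the density figure a sense of scale, the following back-of-the-envelope calculation assumes the 1.762% density is measured over the dashboard prompt set listed above (12,288 prompts of 128 tokens each). The page does not state which token set the density refers to, so treat the result as an estimate under that assumption.

```python
# Figures taken from the dashboard; the link between them is an assumption.
n_prompts = 12_288
tokens_per_prompt = 128
density_percent = 1.762

total_tokens = n_prompts * tokens_per_prompt
expected_active = total_tokens * density_percent / 100

print(total_tokens)            # 1572864
print(round(expected_active))  # 27714
```

So, if the assumption holds, the feature would fire on roughly 27,700 of about 1.57 million tokens.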

