Explanations

questions or statements expressing uncertainty or speculation

oai_token-act-pair · gpt-3.5-turbo

Configuration

neuronpedia/gpt2-small__res_scl-ajt/6-res_scl-ajt

Prompts (Dashboard)

12,288 prompts, 128 tokens each

Dataset (Dashboard)

Skylion007/openwebtext

Features

46,080

Data Type

torch.float32

Hook Point

blocks.6.hook_resid_pre

Architecture

standard

Context Size

128

Dataset

apollo-research/Skylion007-openwebtext-tokenizer-gpt2

Hook Point Layer

6

Activation Function

relu
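To make the configuration above concrete, here is a minimal sketch of how a "standard" ReLU sparse autoencoder is commonly formulated: a residual-stream vector is centered, projected into feature space, and passed through ReLU. The function names, the toy weights, and the choice to subtract the decoder bias before encoding are assumptions for illustration, not details confirmed by this page. The real SAE has 46,080 features and reads GPT-2 small's 768-dimensional residual stream at blocks.6.hook_resid_pre; the toy below uses 2 inputs and 3 features.

```python
def matvec(x, w):
    """Multiply row vector x (length d_in) by matrix w (d_in rows, d_out columns)."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]


def sae_encode(x, w_enc, b_enc, b_dec):
    """Feature activations: relu((x - b_dec) @ W_enc + b_enc).

    Subtracting b_dec first is an assumption; some SAE variants skip it.
    """
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    pre_acts = [p + b for p, b in zip(matvec(centered, w_enc), b_enc)]
    return [max(0.0, a) for a in pre_acts]  # ReLU, as listed under Activation Function


def sae_decode(features, w_dec, b_dec):
    """Reconstruction: features @ W_dec + b_dec."""
    return [r + b for r, b in zip(matvec(features, w_dec), b_dec)]


# Toy weights (hypothetical values chosen so the arithmetic is easy to follow).
x = [1.0, -2.0]
w_enc = [[1.0, 0.0, -1.0], [0.0, 1.0, 1.0]]
b_enc = [0.0, 0.0, 0.0]
b_dec = [0.0, 0.0]
w_dec = [[1.0, 0.0], [0.0, 1.0], [-0.5, 0.5]]

features = sae_encode(x, w_enc, b_enc, b_dec)
reconstruction = sae_decode(features, w_dec, b_dec)
```

In this toy only the first feature fires, so the reconstruction recovers just the component of the input that the active feature's decoder row spans.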

Negative Logits

ciating

-0.68

ertodd

-0.66

herent

-0.58

phrine

-0.58

 Vital

-0.54

umsy

-0.54

cakes

-0.54

etsk

-0.54

ACTION

-0.52

PRESS

-0.52

Positive Logits

?),

0.66

?!

0.58

?).

0.58

?:

0.57

 darn

0.56

?)

0.56

 suppose

0.56

 wonder

0.55

 chalk

0.53

 prest

0.53

Activations Density

7.734%
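As a rough sanity check on the density figure, the sketch below assumes that activation density means the fraction of dashboard tokens on which the feature's activation is nonzero. That definition is an assumption, not something this page states. The prompt count and context length come from the configuration (12,288 prompts of 128 tokens each); the helper name is hypothetical.

```python
def activation_density(activations):
    """Fraction of token positions where the feature activation is above zero."""
    active = sum(1 for a in activations if a > 0)
    return active / len(activations)


# Dashboard totals taken from the configuration.
total_tokens = 12_288 * 128

# Approximate number of tokens the feature fires on, if density is 7.734%.
approx_active_tokens = round(0.07734 * total_tokens)

# Toy usage: two of four positions are active.
toy_density = activation_density([0.0, 0.5, 0.0, 2.0])
```

Under that assumption, the feature would fire on roughly 120,000 of the 1,572,864 dashboard tokens, which is fairly dense for a sparse autoencoder feature.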
