Explanations

names of people, especially historical figures

oai_token-act-pair · gpt-3.5-turbo

Configuration

neuronpedia/gpt2-small__res_scefr-ajt/6-res_scefr-ajt

Prompts (Dashboard)

12,288 prompts, 128 tokens each

Dataset (Dashboard)

Skylion007/openwebtext

Features

46,080

Data Type

torch.float32

Hook Point

blocks.6.hook_resid_pre

Architecture

standard

Context Size

128

Dataset

apollo-research/Skylion007-openwebtext-tokenizer-gpt2

Hook Point Layer

6

Activation Function

relu
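
The configuration above describes a sparse autoencoder with a ReLU activation that reads from blocks.6.hook_resid_pre. As a minimal sketch, assuming "standard" means the usual encoder/decoder form (features = ReLU((x - b_dec) W_enc + b_enc), reconstruction = features W_dec + b_dec), the forward pass might look like the code below. All function and parameter names are hypothetical, and toy dimensions stand in for the real ones (GPT-2 small's residual width of 768 and the 46,080 features listed above). Some implementations skip the subtraction of b_dec from the input.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def relu(v: Vector) -> Vector:
    """Element-wise ReLU, the activation function named in the configuration."""
    return [max(0.0, x) for x in v]


def matvec(v: Vector, m: Matrix) -> Vector:
    """Row vector times matrix: v (n) @ m (n x k) -> (k)."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]


def sae_forward(
    x: Vector, w_enc: Matrix, b_enc: Vector, w_dec: Matrix, b_dec: Vector
) -> Tuple[Vector, Vector]:
    """Assumed 'standard' SAE: returns (feature activations, reconstruction)."""
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    pre_acts = [p + b for p, b in zip(matvec(centered, w_enc), b_enc)]
    feats = relu(pre_acts)
    recon = [r + b for r, b in zip(matvec(feats, w_dec), b_dec)]
    return feats, recon


# Toy example: residual width 2, 3 features (hypothetical weights).
x = [1.0, -2.0]
w_enc = [[1.0, -1.0, 0.0], [0.0, 1.0, 1.0]]    # 2 x 3
b_enc = [0.0, 0.0, 0.5]
w_dec = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 3 x 2
b_dec = [0.0, 0.0]

feats, recon = sae_forward(x, w_enc, b_enc, w_dec, b_dec)
```

In this toy case only the first feature has a positive pre-activation, so it is the only one that survives the ReLU.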

Negative Logits

Token          Logit
"comed"        -1.06
"umpy"         -1.05
"intern"       -1.00
"comings"      -0.99
"icky"         -0.97
"REL"          -0.95
"risome"       -0.95
"efficients"   -0.93
"EEP"          -0.93
"aryl"         -0.92

Positive Logits

Token          Logit
"III"          1.06
"sson"         1.05
" Roberts"     1.04
" Hubbard"     1.02
"Hes"          1.00
" Rodrig"      1.00
" Militia"     0.99
"ovich"        0.97
" Tenth"       0.94
"Gib"          0.94
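
Logit lists like these are commonly derived by projecting a feature's decoder direction through the model's unembedding matrix and sorting the resulting per-token scores. Whether the values on this page were produced exactly that way is an assumption. The sketch below uses hypothetical names and a toy three-token vocabulary, not the real GPT-2 weights.

```python
from typing import Dict, List, Tuple


def logit_effects(
    decoder_dir: List[float], unembed: List[List[float]], vocab: List[str]
) -> Dict[str, float]:
    """Score each token as dot(decoder_dir, corresponding column of unembed).

    unembed has shape (d_model x vocab_size).
    """
    return {
        tok: sum(decoder_dir[i] * unembed[i][j] for i in range(len(decoder_dir)))
        for j, tok in enumerate(vocab)
    }


def top_and_bottom(
    effects: Dict[str, float], k: int
) -> Tuple[List[Tuple[str, float]], List[Tuple[str, float]]]:
    """Return the k most positive and the k most negative tokens."""
    ranked = sorted(effects.items(), key=lambda kv: kv[1], reverse=True)
    positive = ranked[:k]
    negative = ranked[-k:][::-1]  # most negative first
    return positive, negative


# Toy data (hypothetical): d_model = 2, vocabulary of 3 tokens.
toy_vocab = ["a", "b", "c"]
toy_unembed = [[1.0, -1.0, 0.5], [0.0, 2.0, 0.0]]
toy_decoder_dir = [1.0, 0.0]

effects = logit_effects(toy_decoder_dir, toy_unembed, toy_vocab)
positive, negative = top_and_bottom(effects, k=1)
```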

Activations Density 1.655%
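
The page does not define the density. Assuming it is the fraction of dashboard tokens on which the feature is active, the prompt counts listed above give a rough sense of scale. The variable and function names below are hypothetical.

```python
from typing import Sequence

# Figures taken from the dashboard configuration above.
N_PROMPTS = 12_288
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 1.655

total_tokens = N_PROMPTS * TOKENS_PER_PROMPT
# Approximate number of tokens with a nonzero activation, under the
# assumption that density = active tokens / total dashboard tokens.
approx_active_tokens = round(total_tokens * DENSITY_PERCENT / 100)


def activation_density(activations: Sequence[float]) -> float:
    """Fraction of positions where the feature activation is above zero."""
    active = sum(1 for a in activations if a > 0.0)
    return active / len(activations)


print(total_tokens, approx_active_tokens)
```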
