INDEX

Explanations

phrases related to social and political issues, economic inequality, and community welfare

oai_token-act-pair · gpt-3.5-turbo


Configuration

neuronpedia/gpt2-small__res_scl-ajt/6-res_scl-ajt

Prompts (Dashboard)

12,288 prompts, 128 tokens each

Dataset (Dashboard)

Skylion007/openwebtext

Features

46,080

Data Type

torch.float32

Hook Point

blocks.6.hook_resid_pre

Architecture

standard

Context Size

128

Dataset

apollo-research/Skylion007-openwebtext-tokenizer-gpt2

Hook Point Layer

6

Activation Function

relu
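The configuration above describes a "standard" ReLU sparse autoencoder reading the residual stream before block 6. As a minimal sketch, assuming the common formulation in which the decoder bias is subtracted from the input before encoding (the page does not state this), the forward pass might look like the following. The 768-dimensional input is GPT-2 small's residual width. The names `sae_encode`, `sae_decode` and the toy feature count are hypothetical, and the weights are random placeholders rather than the trained parameters.

```python
import numpy as np

D_MODEL = 768          # residual stream width of GPT-2 small
N_FEATURES = 46_080    # feature count listed in the configuration


def sae_encode(x, W_enc, b_enc, b_dec):
    """Feature activations: ReLU((x - b_dec) @ W_enc + b_enc)."""
    return np.maximum((x - b_dec) @ W_enc + b_enc, 0.0)


def sae_decode(f, W_dec, b_dec):
    """Reconstruction of the residual stream from feature activations."""
    return f @ W_dec + b_dec


# Toy demonstration with a small feature count so it runs quickly.
# Replace toy_features with N_FEATURES for the real shapes.
rng = np.random.default_rng(0)
toy_features = 64
W_enc = rng.standard_normal((D_MODEL, toy_features), dtype=np.float32) * 0.02
W_dec = rng.standard_normal((toy_features, D_MODEL), dtype=np.float32) * 0.02
b_enc = np.zeros(toy_features, dtype=np.float32)
b_dec = np.zeros(D_MODEL, dtype=np.float32)

# One context of 128 token positions, matching the listed context size.
resid_pre = rng.standard_normal((128, D_MODEL), dtype=np.float32)
features = sae_encode(resid_pre, W_enc, b_enc, b_dec)
reconstruction = sae_decode(features, W_dec, b_dec)
```

With the real weights, `resid_pre` would be the activation captured at `blocks.6.hook_resid_pre`, and each column of `features` would correspond to one of the 46,080 features.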


Negative Logits

auga

-0.62

uador

-0.59

arter

-0.57

ande

-0.55

thood

-0.55

zie

-0.54

nikov

-0.53

andum

-0.53

ritz

-0.53

gat

-0.52
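Logit tables of this kind are usually read as the direct effect of the feature on next-token logits. A rough sketch, assuming the values come from projecting the feature's decoder direction through the model's unembedding matrix (an assumption, since the page does not say how they are computed), is shown below. `top_logit_effects` is a hypothetical helper, and the matrices are random stand-ins with a toy vocabulary.

```python
import numpy as np


def top_logit_effects(decoder_row, W_U, k=10):
    """Return the k most negative and k most positive logit effects.

    decoder_row: (d_model,) decoder direction of one feature.
    W_U: (d_model, vocab) unembedding matrix.
    Each result is a list of (token_id, effect) pairs.
    """
    effects = decoder_row @ W_U
    order = np.argsort(effects)  # ascending
    negative = [(int(i), float(effects[i])) for i in order[:k]]
    positive = [(int(i), float(effects[i])) for i in order[::-1][:k]]
    return negative, positive


# Random placeholders, not the real GPT-2 or SAE weights.
rng = np.random.default_rng(0)
d_model, toy_vocab = 768, 1000
decoder_row = rng.standard_normal(d_model).astype(np.float32)
decoder_row /= np.linalg.norm(decoder_row)
W_U = rng.standard_normal((d_model, toy_vocab)).astype(np.float32)

negative, positive = top_logit_effects(decoder_row, W_U)
```

Mapping each token id back to its string with the GPT-2 tokenizer would then give a table in the same shape as the one on this page.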

Positive Logits

pires

0.77

 turns

0.65

 translates

0.61

ifies

0.57

 happens

0.56

 turned

0.54

 itself

0.54

ãĤ©

0.53

 describes

0.53

¢

0.53
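Entries such as ãĤ© and ¢ are GPT-2 byte-level token spellings, in which each character stands for one raw byte, and not ordinary text. The sketch below rebuilds the byte-to-character table used by GPT-2's byte-level BPE and inverts it to recover the underlying bytes. `decode_token` is a hypothetical helper. It passes through characters that are not in the table (such as the already-decoded leading space in " turns"), which is an assumption about how this page mixes the two spellings.

```python
def gpt2_char_to_byte():
    """Inverse of the GPT-2 byte-level BPE byte-to-unicode table."""
    # Bytes that are printable are mapped to themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    n = 0
    # Remaining bytes are shifted to code points starting at 256.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + n)
            n += 1
    return {chr(c): b for b, c in zip(byte_values, code_points)}


def decode_token(token):
    """Decode a byte-level token spelling into readable text."""
    table = gpt2_char_to_byte()
    raw = bytearray()
    for ch in token:
        if ch in table:
            raw.append(table[ch])
        else:
            # Character is not a byte-level symbol; keep it unchanged.
            raw.extend(ch.encode("utf-8"))
    # Tokens may hold an incomplete UTF-8 sequence, so do not raise.
    return raw.decode("utf-8", errors="replace")


katakana = decode_token("ãĤ©")    # three bytes: E3 82 A9
partial = decode_token("¢")       # single byte A2, not valid alone
spaced = decode_token("Ġturns")   # Ġ is the byte-level spelling of a space
```

Here `katakana` is the single character ォ (U+30A9), while ¢ is a lone continuation byte that only forms a character together with neighbouring tokens, so it decodes to the replacement character.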

Activations Density 12.495%
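For a sense of scale, the density can be combined with the dashboard prompt counts listed in the configuration. This is a back-of-the-envelope sketch that assumes the density is the fraction of dashboard token positions on which the feature has a non-zero activation; the page does not define the figure.

```python
N_PROMPTS = 12_288         # dashboard prompts
TOKENS_PER_PROMPT = 128    # tokens per prompt
DENSITY_PERCENT = 12.495   # activations density shown above

total_tokens = N_PROMPTS * TOKENS_PER_PROMPT
expected_active = total_tokens * DENSITY_PERCENT / 100

print(total_tokens)            # 1572864
print(round(expected_active))  # 196529
```

Under that assumption the feature would fire on roughly one token position in eight, which is dense for a sparse autoencoder feature.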

