Explanations

comparisons in terms of improvement or decline over time

oai_token-act-pair · gpt-3.5-turbo Triggered by @bot

Top Features by Cosine Similarity

Comparing With GPT2-SMALL @ 8-res_fs768-jb

Configuration

jbloom/GPT2-Small-Feature-Splitting-Experiment-Layer-8/blocks.8.hook_resid_pre_768

Prompts (Dashboard): 24,576 prompts, 128 tokens each
Dataset (Dashboard): Skylion007/openwebtext
Features: 768
Data Type: torch.float32
Hook Point: blocks.8.hook_resid_pre
Architecture: standard
Context Size: 128
Dataset: Skylion007/openwebtext
Hook Point Layer:
Activation Function: relu
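
The hook point name encodes both the layer index and the activation site. A minimal sketch of pulling the two apart, assuming the `blocks.<layer>.<site>` naming pattern seen in the hook point above; `HookPoint` and `parse_hook_point` are hypothetical names used for illustration, not part of any library.

```python
import re
from typing import NamedTuple


class HookPoint(NamedTuple):
    """Layer index and activation site taken from a hook point name."""
    layer: int
    site: str


def parse_hook_point(name: str) -> HookPoint:
    """Split a name such as 'blocks.8.hook_resid_pre' into layer and site.

    Hypothetical helper; assumes the 'blocks.<layer>.<site>' pattern.
    """
    match = re.fullmatch(r"blocks\.(\d+)\.(\w+)", name)
    if match is None:
        raise ValueError(f"unrecognized hook point name: {name!r}")
    return HookPoint(layer=int(match.group(1)), site=match.group(2))


hook = parse_hook_point("blocks.8.hook_resid_pre")
print(hook.layer, hook.site)  # prints: 8 hook_resid_pre
```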

Negative Logits

" Became"         -0.58
"spin"            -0.55
"annels"          -0.54
"unique"          -0.53
"ointed"          -0.53
"urai"            -0.52
" Finally"        -0.52
" PLEASE"         -0.52
" finally"        -0.52
"mail"            -0.51

Positive Logits

" predecessors"    1.30
" previous"        1.24
" counterparts"    1.05
" predecessor"     1.01
" usual"           0.99
" previously"      0.95
" earlier"         0.95
"usual"            0.93
" preceding"       0.90
" elsewhere"       0.89
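
Logit lists like these are commonly produced by projecting a feature's decoder direction through the model's unembedding matrix and sorting the per-token scores. A minimal sketch of that ranking step, assuming this is how the values were derived (the page does not say). The two-dimensional vectors are made-up toy values, not GPT-2 weights, and `logit_effects` and `top_k` are hypothetical helpers.

```python
from typing import Dict, List, Tuple


def logit_effects(decoder_dir: List[float],
                  unembed: Dict[str, List[float]]) -> Dict[str, float]:
    """Dot the decoder direction with each token's unembedding vector."""
    return {
        token: sum(d * u for d, u in zip(decoder_dir, vec))
        for token, vec in unembed.items()
    }


def top_k(effects: Dict[str, float], k: int,
          positive: bool = True) -> List[Tuple[str, float]]:
    """Return the k most positive (or most negative) token scores."""
    ranked = sorted(effects.items(), key=lambda kv: kv[1], reverse=positive)
    return ranked[:k]


# Toy values for illustration only; leading spaces are part of the token.
decoder_dir = [1.0, -0.5]
unembed = {
    " previous": [1.0, 0.0],
    " earlier": [0.5, -0.5],
    "spin": [-0.5, 0.5],
    "mail": [0.0, 1.0],
}

effects = logit_effects(decoder_dir, unembed)
print(top_k(effects, 2, positive=True))   # [(' previous', 1.0), (' earlier', 0.75)]
print(top_k(effects, 2, positive=False))  # [('spin', -0.75), ('mail', -0.5)]
```

Tokens are kept as exact strings because a leading space distinguishes otherwise identical entries, such as " usual" and "usual" in the positive list.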

Activations Density 5.346%
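
Activation density is usually read as the fraction of tokens on which the feature fires, meaning its activation is greater than zero after the ReLU. A minimal sketch under that assumption; `activation_density` is a hypothetical helper and the sample activations are made up. The token count uses the dashboard figures above (24,576 prompts of 128 tokens each), and whether the density was measured on exactly that set is also an assumption.

```python
from typing import Sequence


def activation_density(activations: Sequence[float]) -> float:
    """Fraction of token positions where the feature activation is > 0."""
    if not activations:
        raise ValueError("need at least one activation")
    active = sum(1 for a in activations if a > 0.0)
    return active / len(activations)


# Made-up activations for eight token positions: two are nonzero.
toy = [0.0, 0.0, 1.2, 0.0, 0.4, 0.0, 0.0, 0.0]
print(f"{activation_density(toy):.3%}")  # prints: 25.000%

# Scale of the dashboard prompt set, assuming density was measured on it.
total_tokens = 24_576 * 128                     # 3,145,728 tokens
approx_active = round(0.05346 * total_tokens)   # about 168,171 tokens
print(total_tokens, approx_active)
```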
