Explanations

expressions of gratitude or appreciation

oai_token-act-pair · gpt-3.5-turbo

Configuration

neuronpedia/gpt2-small__res_scl-ajt/6-res_scl-ajt

Prompts (Dashboard)

12,288 prompts, 128 tokens each

Dataset (Dashboard)

Skylion007/openwebtext

Features

46,080

Data Type

torch.float32

Hook Point

blocks.6.hook_resid_pre

Architecture

standard

Context Size

128

Dataset

apollo-research/Skylion007-openwebtext-tokenizer-gpt2

Hook Point Layer

6

Activation Function

relu
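
The configuration above describes a sparse autoencoder with 46,080 features and a ReLU activation, reading the residual stream at blocks.6.hook_resid_pre. As a rough illustration only, the sketch below shows what a "standard" encoder and decoder could look like in plain Python. The names `encode` and `decode`, the weight layout (one vector per feature), and the choice to subtract the decoder bias before encoding are assumptions made for this example, not details taken from the page. The width of 768 is GPT-2 small's residual dimension, which makes 46,080 a 60x expansion.

```python
D_MODEL = 768         # residual stream width of GPT-2 small
N_FEATURES = 46_080   # feature count listed in the configuration


def encode(x, w_enc, b_enc, b_dec):
    """Map one residual vector to non-negative feature activations.

    w_enc holds one weight vector per feature (hypothetical layout).
    Subtracting b_dec first is an assumption; some SAEs skip this step.
    """
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    pre = [
        sum(c * w for c, w in zip(centered, weights)) + bias
        for weights, bias in zip(w_enc, b_enc)
    ]
    return [max(0.0, p) for p in pre]  # ReLU


def decode(acts, w_dec, b_dec):
    """Rebuild the residual vector as a weighted sum of feature directions."""
    out = list(b_dec)
    for act, direction in zip(acts, w_dec):
        if act == 0.0:
            continue  # inactive features contribute nothing
        for j, d in enumerate(direction):
            out[j] += act * d
    return out


# Toy example with 2 dimensions and 2 features instead of 768 and 46,080.
acts = encode([1.0, 0.0], w_enc=[[1.0, 0.0], [-1.0, 0.0]],
              b_enc=[0.0, 0.0], b_dec=[0.0, 0.0])
recon = decode(acts, w_dec=[[2.0, 3.0], [5.0, 7.0]], b_dec=[0.5, 0.5])
```

In the toy example only the first feature fires, so the reconstruction is the decoder bias plus the first feature's direction.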

Negative Logits

 deviation

-0.64

 hardness

-0.62

itic

-0.62

 projecting

-0.61

龍契士

-0.60

エル

-0.59

Osc

-0.59

["

-0.58

 Luxem

-0.58

inese

-0.57
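
GPT-2's vocabulary stores tokens in a byte-level form, so entries containing non-ASCII text show up as runs of Latin-1 and extended Latin characters rather than the text they encode. The sketch below rebuilds the byte-to-character table used by GPT-2's byte-level BPE and inverts it to recover readable text. The helper name `decode_vocab_entry` is hypothetical and chosen for this example.

```python
def bytes_to_unicode():
    """Byte -> printable character table used by GPT-2's byte-level BPE."""
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("\u00a1"), ord("\u00ac") + 1))
        + list(range(ord("\u00ae"), ord("\u00ff") + 1))
    )
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            # Bytes without a printable form are shifted above U+00FF.
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


CHAR_TO_BYTE = {char: byte for byte, char in bytes_to_unicode().items()}


def decode_vocab_entry(entry: str) -> str:
    """Turn a raw vocabulary string back into the text it represents."""
    raw = bytes(CHAR_TO_BYTE[ch] for ch in entry)
    return raw.decode("utf-8", errors="replace")


# Raw vocabulary forms of two tokens, written with escapes for clarity.
print(decode_vocab_entry("\u00e9\u00be\u012f\u00e5\u00a5\u0133\u00e5\u00a3\u00ab"))
print(decode_vocab_entry("\u00e3\u0124\u00a8\u00e3\u0125\u00ab"))
```

The same table maps a leading space to the character `Ġ`, which is why tokens that begin a new word are often shown with that prefix in raw vocabulary dumps.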

Positive Logits

gments

0.77

gements

0.76

 Thank

0.71

Override

0.68

giving

0.68

ride

0.65

ribly

0.65

bles

0.65

ickets

0.64

 recipients

0.63

Activations Density 2.793%
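
To give the density figure some scale, the arithmetic below converts it into an approximate token count. It assumes, without confirmation from the page, that density means the fraction of dashboard tokens on which the feature has a nonzero activation, and that it was measured over the 12,288 prompts of 128 tokens listed earlier. The function `activation_density` is a hypothetical helper written for this example.

```python
N_PROMPTS = 12_288        # dashboard prompts
TOKENS_PER_PROMPT = 128   # tokens in each prompt
DENSITY = 0.02793         # 2.793% as a fraction

total_tokens = N_PROMPTS * TOKENS_PER_PROMPT
approx_active_tokens = round(total_tokens * DENSITY)


def activation_density(activations):
    """Fraction of positions where the feature is active (assumed definition)."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > 0.0)
    return active / len(activations)


print(total_tokens)           # 1572864
print(approx_active_tokens)   # 43930
```

Under these assumptions the feature would fire on roughly 44 thousand of about 1.57 million tokens.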
