Explanations

negative sentiments such as criticisms or concerns in the text

oai_token-act-pair · gpt-3.5-turbo

Configuration

neuronpedia/gpt2-small__res_scefr-ajt/6-res_scefr-ajt

Prompts (Dashboard)

12,288 prompts, 128 tokens each

Dataset (Dashboard)

Skylion007/openwebtext

Features

46,080

Data Type

torch.float32

Hook Point

blocks.6.hook_resid_pre

Architecture

standard

Context Size

128

Dataset

apollo-research/Skylion007-openwebtext-tokenizer-gpt2

Hook Point Layer

6

Activation Function

relu
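
As a rough illustration of what a "standard" architecture with a relu activation usually means for a sparse autoencoder, the encoder is commonly written as f = ReLU(W_enc (x - b_dec) + b_enc). The sketch below assumes that form, which this page does not confirm. The names `encode`, `w_enc`, `b_enc`, and `b_dec` are hypothetical, and the toy sizes stand in for the 768-dimensional residual stream of GPT-2 small and the 46,080 features listed above.

```python
def relu(values):
    """Element-wise ReLU over a list of floats."""
    return [max(0.0, v) for v in values]


def encode(x, w_enc, b_enc, b_dec):
    """Toy SAE encoder (assumed form): ReLU(W_enc (x - b_dec) + b_enc).

    x      : list of d_in floats (one residual-stream vector)
    w_enc  : d_sae rows, each a list of d_in floats
    b_enc  : list of d_sae floats
    b_dec  : list of d_in floats
    """
    # Subtract the decoder bias from the input before projecting.
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    # One pre-activation per feature: dot(row, centered) + bias.
    pre = [
        sum(w * c for w, c in zip(row, centered)) + b
        for row, b in zip(w_enc, b_enc)
    ]
    return relu(pre)


# Toy example with d_in = 3 and d_sae = 4 (hypothetical values).
x = [1.0, -2.0, 0.5]
b_dec = [0.0, 0.0, 0.5]
w_enc = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]
b_enc = [0.0, 0.5, -0.1, 2.0]

acts = encode(x, w_enc, b_enc, b_dec)
print(acts)
```

Features whose pre-activation is negative are clamped to zero, which is what makes the feature activations sparse.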

Negative Logits

hoe

-0.98

Downloadha

-0.95

anwhile

-0.93

£ı

-0.83

Nusra

-0.80

tone

-0.79

 Ferr

-0.75

ãĥ¯ãĥ³

-0.75

 chants

-0.75

bone

-0.74

Positive Logits

avorite

1.07

ortun

1.04

ancies

1.02

ortunate

1.02

ixed

1.00

athom

1.00

rost

1.00

avour

0.99

itted

0.97

iaries

0.96
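
One common way to produce token lists like the two above is to project a feature's decoder direction through the model's unembedding matrix and rank tokens by the resulting score. Whether this dashboard uses exactly that procedure is an assumption. The function names, the decoder vector, and the four-token vocabulary below are hypothetical and serve only as a sketch.

```python
def feature_logits(decoder_dir, unembed):
    """Score each token as the dot product of the feature's decoder
    direction with that token's unembedding vector (assumed method)."""
    return {
        token: sum(d * u for d, u in zip(decoder_dir, vec))
        for token, vec in unembed.items()
    }


def top_and_bottom(scores, k):
    """Return the k highest-scoring and k lowest-scoring tokens."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    positive = ranked[:k]
    negative = ranked[-k:][::-1]  # most negative first
    return positive, negative


# Toy 2-dimensional model with a 4-token vocabulary (hypothetical values).
decoder_dir = [1.0, -1.0]
unembed = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [0.5, 0.5],
    "d": [2.0, 0.5],
}

scores = feature_logits(decoder_dir, unembed)
positive, negative = top_and_bottom(scores, k=2)
print("positive:", positive)
print("negative:", negative)
```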

Activations Density 0.225%
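
For a sense of scale, the density can be converted into an approximate token count. This assumes the 0.225% figure is measured over the 12,288 dashboard prompts of 128 tokens each, which the page does not state explicitly.

```python
# Values taken from the configuration on this page.
prompts = 12_288
tokens_per_prompt = 128
density = 0.225 / 100  # 0.225% expressed as a fraction

# Total tokens in the dashboard prompt set.
total_tokens = prompts * tokens_per_prompt

# Approximate number of tokens on which the feature is active,
# assuming the density was computed over this prompt set.
approx_active = round(total_tokens * density)

print(total_tokens, approx_active)
```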
