Explanations

references to legal, political, and crime-related topics

oai_token-act-pair · gpt-3.5-turbo

Configuration

neuronpedia/gpt2-small__res_scl-ajt/6-res_scl-ajt

Prompts (Dashboard): 12,288 prompts, 128 tokens each
Dataset (Dashboard): Skylion007/openwebtext
Features: 46,080
Data Type: torch.float32
Hook Point: blocks.6.hook_resid_pre
Architecture: standard
Context Size: 128
Dataset: apollo-research/Skylion007-openwebtext-tokenizer-gpt2
Hook Point Layer:
Activation Function: relu
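The configuration above describes a sparse autoencoder with a ReLU activation that reads from blocks.6.hook_resid_pre. As a minimal sketch of how one feature activation could be computed, assuming the common formulation f = relu((x - b_dec) @ W_enc + b_enc): the names `encode`, `W_ENC`, `B_ENC`, and `B_DEC` are hypothetical, the weights are toy values, and subtracting the decoder bias from the input is itself an assumption about this SAE. The real SAE has 46,080 features; the toy one below has two.

```python
from typing import List


def relu(x: float) -> float:
    """Rectified linear unit: negative pre-activations become zero."""
    return x if x > 0.0 else 0.0


def encode(
    resid: List[float],
    w_enc: List[List[float]],
    b_enc: List[float],
    b_dec: List[float],
) -> List[float]:
    """Toy SAE encoder: f = relu((x - b_dec) @ W_enc + b_enc).

    resid has length d_model, w_enc has shape [d_model][n_features].
    Centering the input on b_dec is an assumption, not confirmed by the page.
    """
    centered = [x - b for x, b in zip(resid, b_dec)]
    acts = []
    for j in range(len(b_enc)):
        pre = sum(centered[i] * w_enc[i][j] for i in range(len(centered)))
        acts.append(relu(pre + b_enc[j]))
    return acts


# Toy dimensions for illustration only: d_model = 3, n_features = 2.
W_ENC = [[1.0, -1.0], [0.5, 0.0], [0.0, 2.0]]
B_ENC = [0.0, -0.5]
B_DEC = [0.0, 0.0, 0.0]

resid = [1.0, 2.0, -1.0]  # stand-in for one residual stream vector
feature_acts = encode(resid, W_ENC, B_ENC, B_DEC)
print(feature_acts)
```

In this toy input the second feature has a negative pre-activation, so the ReLU sets it to zero; that zeroing is what makes the feature activations sparse.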

Negative Logits

Token         Logit
"EStream"     -0.76
"Ô"           -0.72
" bounded"    -0.71
"apr"         -0.70
"skirts"      -0.69
"kees"        -0.68
"dur"         -0.68
" neigh"      -0.67
" wand"       -0.66
"kne"         -0.65

Positive Logits

Token             Logit
"CS"              0.85
"CI"              0.82
"EMS"             0.76
"EO"              0.73
"KA"              0.73
"HQ"              0.71
" Investigator"   0.71
" Division"       0.70
" Group"          0.70
"LD"              0.70
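Logit lists like the ones above are usually produced by projecting a feature's decoder direction through the model's unembedding matrix and ranking the vocabulary by the result. The sketch below assumes that procedure. The function names, the four-token vocabulary, and all numbers are hypothetical toy values, not data from this feature.

```python
from typing import List, Tuple


def logit_effects(
    decoder_dir: List[float], w_unembed: List[List[float]]
) -> List[float]:
    """Project a decoder direction (length d_model) through a
    [d_model][n_vocab] unembedding matrix, giving one value per token."""
    n_vocab = len(w_unembed[0])
    return [
        sum(decoder_dir[i] * w_unembed[i][v] for i in range(len(decoder_dir)))
        for v in range(n_vocab)
    ]


def top_and_bottom(
    effects: List[float], vocab: List[str], k: int
) -> Tuple[List[Tuple[str, float]], List[Tuple[str, float]]]:
    """Return the k most negative and the k most positive (token, value) pairs."""
    ranked = sorted(zip(vocab, effects), key=lambda pair: pair[1])
    negative = ranked[:k]         # most negative first
    positive = ranked[::-1][:k]   # most positive first
    return negative, positive


# Hypothetical toy setup: d_model = 2, vocabulary of 4 tokens.
VOCAB = ["tok_a", "tok_b", "tok_c", "tok_d"]
W_U = [[1.0, 0.0, -1.0, 0.5], [0.0, 2.0, 0.0, -1.0]]
DECODER_DIR = [2.0, 0.5]

effects = logit_effects(DECODER_DIR, W_U)
negative, positive = top_and_bottom(effects, VOCAB, k=2)
print("negative:", negative)
print("positive:", positive)
```

Read this way, a positive entry is a token whose logit the feature pushes up when it fires, and a negative entry is one it pushes down.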

Activations Density: 12.885%
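An activation density is commonly read as the fraction of token positions on which the feature fires. The sketch below assumes that definition, with "fires" meaning a strictly positive activation, and assumes the density was measured over the dashboard prompts (12,288 prompts of 128 tokens each). The helper name `activation_density` and the toy activations are hypothetical.

```python
from typing import List


def activation_density(acts: List[float]) -> float:
    """Fraction of token positions with a strictly positive activation."""
    if not acts:
        return 0.0
    return sum(1 for a in acts if a > 0.0) / len(acts)


# Figures taken from the dashboard configuration.
N_PROMPTS = 12_288
TOKENS_PER_PROMPT = 128
REPORTED_DENSITY = 0.12885

total_tokens = N_PROMPTS * TOKENS_PER_PROMPT
# Token positions implied by the reported density, under the assumptions above.
implied_active = round(REPORTED_DENSITY * total_tokens)

# Toy activations for eight token positions, two of them non-zero.
toy_acts = [0.0, 1.2, 0.0, 0.0, 0.4, 0.0, 0.0, 0.0]
toy_density = activation_density(toy_acts)

print(total_tokens, implied_active, toy_density)
```

Under these assumptions, a density of 12.885% would correspond to roughly 203 thousand of the 1,572,864 dashboard token positions.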
