Explanations

words related to loyalty

oai_token-act-pair · gpt-4o-mini Triggered by @bot

Configuration

neuronpedia/gpt2-small__res_slefr-ajt/2-res_slefr-ajt

Prompts (Dashboard): 12,288 prompts, 128 tokens each
Dataset (Dashboard): Skylion007/openwebtext
Features: 46,080
Data Type: torch.float32
Hook Point: blocks.2.hook_resid_pre
Architecture: standard
Context Size: 128
Dataset: apollo-research/Skylion007-openwebtext-tokenizer-gpt2
Hook Point Layer:
Activation Function: relu
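The fields above read like the configuration of a sparse autoencoder (SAE) attached to blocks.2.hook_resid_pre, with a "standard" architecture and a ReLU activation. As a rough illustration only, the sketch below shows one common formulation of such a forward pass in NumPy. It is not taken from this page: the names sae_forward, W_enc, b_enc, W_dec and b_dec are hypothetical, the weights are random placeholders, the width 768 is GPT-2 small's residual-stream size, and the feature count is shrunk from 46,080 to keep the demo light. Whether the decoder bias is subtracted from the input varies between implementations.

```python
import numpy as np

def sae_forward(x, W_enc, b_enc, W_dec, b_dec):
    """Encode residual-stream vectors into non-negative features, then reconstruct.

    Assumed formulation (not confirmed by the dashboard):
        feats = relu((x - b_dec) @ W_enc + b_enc)
        x_hat = feats @ W_dec + b_dec
    """
    pre_acts = (x - b_dec) @ W_enc + b_enc
    feats = np.maximum(pre_acts, 0.0)  # ReLU, matching "Activation Function: relu"
    x_hat = feats @ W_dec + b_dec
    return feats, x_hat

rng = np.random.default_rng(0)
d_model = 768      # GPT-2 small residual width
n_features = 512   # demo size only; the dashboard lists 46,080 features
context = 128      # matches "Context Size: 128"

# Random float32 placeholders standing in for trained weights.
W_enc = rng.standard_normal((d_model, n_features), dtype=np.float32) * 0.02
W_dec = rng.standard_normal((n_features, d_model), dtype=np.float32) * 0.02
b_enc = np.zeros(n_features, dtype=np.float32)
b_dec = np.zeros(d_model, dtype=np.float32)

# One prompt's worth of residual-stream activations (random stand-in).
x = rng.standard_normal((context, d_model), dtype=np.float32)
feats, x_hat = sae_forward(x, W_enc, b_enc, W_dec, b_dec)
print(feats.shape, x_hat.shape)
```

With trained weights, each column of feats would correspond to one feature such as the one described on this page, and most entries would be zero.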

Negative Logits

externalActionCode    -0.80
soDeliveryDate        -0.68
ĸļ                    -0.68
 Journals             -0.65
Else                  -0.64
 Genetics             -0.64
 Anthropology         -0.63
 Seeds                -0.62
 Jill                 -0.61
è¦ļéĨĴ                -0.60

Positive Logits

ty            1.04
ties          0.82
thouse        0.73
iful          0.70
erate         0.69
 loyal        0.68
eties         0.66
izations      0.65
itionally     0.65
elin          0.65

Activation Density: 0.006%

No Known Activations