Explanations

documents that mention the act of destroying or tearing something apart

oai_token-act-pair · gpt-3.5-turbo

terms related to fragmentation or tearing apart, particularly in a figurative or literal sense

oai_token-act-pair · gpt-4o-mini

GPT2-SMALL @ 5-res-jb

Configuration

jbloom/GPT2-Small-SAEs-Reformatted/blocks.5.hook_resid_pre

Prompts (Dashboard): 24,576 prompts, 128 tokens each
Dataset (Dashboard): Skylion007/openwebtext
Features: 24,576
Data Type: torch.float32
Hook Point: blocks.5.hook_resid_pre
Architecture: standard
Context Size: 128
Dataset: Skylion007/openwebtext
Hook Point Layer:
Activation Function: relu
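As a reading aid, the configuration above can be sketched in code. This is a minimal illustration, not the published implementation: the SAEConfig class and the encode function are hypothetical names, the weights in the usage lines are made-up toy values, and subtracting the decoder bias before encoding is an assumption about what the "standard" architecture does. Only the field values in SAEConfig come from the listing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SAEConfig:
    """Hypothetical container; field values are copied from the listing above."""
    hook_point: str = "blocks.5.hook_resid_pre"
    n_features: int = 24_576
    context_size: int = 128
    dtype: str = "torch.float32"
    architecture: str = "standard"
    activation: str = "relu"
    dataset: str = "Skylion007/openwebtext"


def relu(values):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, v) for v in values]


def encode(x, w_enc, b_enc, b_dec):
    """Feature activations for one residual-stream vector.

    Computes relu((x - b_dec) @ w_enc + b_enc). Centering the input by
    b_dec is an assumption, not something the listing states.
    w_enc has shape [d_in][n_features].
    """
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    pre_acts = [
        sum(c * w_enc[i][j] for i, c in enumerate(centered)) + b_enc[j]
        for j in range(len(b_enc))
    ]
    return relu(pre_acts)


cfg = SAEConfig()

# Toy dimensions (2 inputs, 3 features) so the arithmetic is easy to follow.
toy_x = [1.0, -2.0]
toy_w_enc = [[1.0, -1.0, 0.0],
             [0.0, 1.0, 1.0]]
toy_b_enc = [0.0, 0.0, 0.5]
toy_b_dec = [0.0, 0.0]
toy_acts = encode(toy_x, toy_w_enc, toy_b_enc, toy_b_dec)
print(toy_acts)  # [1.0, 0.0, 0.0]
```

With a ReLU activation, most features are exactly zero on any given token, which is consistent with a feature page reporting a very low activation density.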

Negative Logits

Token            Logit
" Kamp"          -0.69
"xon"            -0.69
"Nou"            -0.68
"stown"          -0.66
" Belg"          -0.64
"ague"           -0.63
"asia"           -0.62
"pect"           -0.61
" Parkinson"     -0.61
"lda"            -0.60

Positive Logits

Token            Logit
"ding"           1.38
" shred"         0.99
"sburg"          0.97
"aby"            0.92
" tremend"       0.87
(not rendered)   0.86
"itionally"      0.83
"soever"         0.81
"anguage"        0.81
" horm"          0.80

Activations Density: 0.024%
