Explanations

Japanese characters with specific strokes and proportions

oai_token-act-pair · gpt-3.5-turbo

specific non-English characters or symbols

oai_token-act-pair · gpt-4o-mini Triggered by @bot

Top Features by Cosine Similarity

Comparing With GPT2-SMALL @ 6-res-jb

Configuration

jbloom/GPT2-Small-SAEs-Reformatted/blocks.6.hook_resid_pre

Prompts (Dashboard): 24,576 prompts, 128 tokens each
Dataset (Dashboard): Skylion007/openwebtext
Features: 24,576
Data Type: torch.float32
Hook Point: blocks.6.hook_resid_pre
Architecture: standard
Context Size: 128
Dataset: Skylion007/openwebtext
Hook Point Layer: 6
Activation Function: relu
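As a rough illustration of what a "standard" architecture with a relu activation usually means for a sparse autoencoder, here is a minimal pure-Python sketch. It assumes the common formulation f = relu(W_enc (x - b_dec) + b_enc) with reconstruction x_hat = W_dec^T f + b_dec; the function names, the toy weights, and the tiny dimensions (2 inputs, 3 features instead of 24,576) are hypothetical and are not taken from this SAE.

```python
def relu(values):
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in values]

def encode(x, W_enc, b_enc, b_dec):
    """Feature activations: relu(W_enc @ (x - b_dec) + b_enc).

    W_enc holds one row of input weights per feature (assumed layout).
    """
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    pre = [sum(c * w for c, w in zip(centered, row)) + b
           for row, b in zip(W_enc, b_enc)]
    return relu(pre)

def decode(f, W_dec, b_dec):
    """Reconstruction: sum_i f[i] * W_dec[i] + b_dec."""
    return [sum(fi * row[j] for fi, row in zip(f, W_dec)) + b_dec[j]
            for j in range(len(b_dec))]

# Toy weights, purely illustrative.
x = [1.0, 2.0]
b_dec = [0.0, 0.0]
W_enc = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b_enc = [0.0, -1.0, 0.0]
W_dec = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]

features = encode(x, W_enc, b_enc, b_dec)
reconstruction = decode(features, W_dec, b_dec)
print(features)        # third feature is clipped to zero by relu
print(reconstruction)
```

In the real SAE the input would be the residual-stream vector read at blocks.6.hook_resid_pre, and only a handful of the 24,576 features would be non-zero for any given token.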

Negative Logits

Token       Logit
"ESS"       -0.71
" Beir"     -0.71
" tort"     -0.66
"JPM"       -0.64
" hypers"   -0.63
" broker"   -0.63
" Claus"    -0.61
"arella"    -0.60
"afort"     -0.59
"ODY"       -0.58

Positive Logits

Token       Logit
"nen"       1.08
"Åį"        1.07
"Å«"        1.06
"Â·Â·"      0.97
"nin"       0.96
"su"        0.92
"shi"       0.91
"Ê"         0.91
"ãĥ³ãĤ¸"    0.91
"Äģ"        0.89
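Several of the positive-logit tokens look garbled because they appear to be raw GPT-2 vocabulary strings, in which each byte of the UTF-8 text is shown as a stand-in printable character. The sketch below rebuilds the byte-to-character table in the way the public GPT-2 tokenizer code does and inverts it; the helper names token_bytes and decode_token are hypothetical, and treating the dashboard strings as raw vocabulary entries is an assumption.

```python
def bytes_to_unicode():
    """Byte -> printable-character table used by GPT-2's byte-level BPE."""
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(ord("¡"), ord("¬") + 1))
          + list(range(ord("®"), ord("ÿ") + 1)))
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            # Non-printable bytes are shifted to code points 256 and up.
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))

CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}

def token_bytes(token: str) -> bytes:
    """Map a vocabulary string back to the bytes it stands for."""
    out = bytearray()
    for ch in token:
        if ch in CHAR_TO_BYTE:
            out.append(CHAR_TO_BYTE[ch])
        else:
            # Characters outside the table (e.g. a literal space) pass through.
            out.extend(ch.encode("utf-8"))
    return bytes(out)

def decode_token(token: str) -> str:
    """Decode to text; incomplete UTF-8 sequences become U+FFFD."""
    return token_bytes(token).decode("utf-8", errors="replace")

for tok in ["nen", "Åį", "Å«", "Â·Â·", "ãĥ³ãĤ¸", "Äģ"]:
    print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, "Åį", "Å«" and "Äģ" decode to the macron vowels ō, ū and ā used in romanized Japanese, "ãĥ³ãĤ¸" decodes to the katakana ンジ, and "Â·Â·" to two middle dots. "Ê" is the single byte 0xCA, the first half of a two-byte UTF-8 sequence, so it has no standalone decoding.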

Activations Density 0.008%
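To give the density figure a sense of scale, the short calculation below converts it into an approximate token count. It assumes the density is measured over the dashboard prompt set listed under Configuration (24,576 prompts of 128 tokens each); that is an assumption, since the page does not say which tokens the percentage is taken over.

```python
# Dashboard prompt set, as listed under Configuration.
n_prompts = 24_576
tokens_per_prompt = 128

# Reported activation density, converted from percent to a fraction.
density = 0.008 / 100

total_tokens = n_prompts * tokens_per_prompt
expected_active = total_tokens * density  # assumes density is over these tokens

print(total_tokens)            # 3145728
print(round(expected_active))  # 252
```

So under that assumption the feature would fire on only a few hundred of roughly 3.1 million tokens.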

No Known Activations