Explanations

mentions of a particular fictional character

oai_token-act-pair · gpt-3.5-turbo

references to the character Gandalf from the "Lord of the Rings" and "Hobbit" series

oai_token-act-pair · gpt-4o-mini Triggered by @bot

Characters, concepts, and places related to The Lord of the Rings.

Explanation Uploaded by User

Characters, concepts, and places related to The Lord of the Rings, but only when they start with a capital letter.

Explanation Uploaded by User

Words related to The Lord of the Rings, or words that sound like them.

Explanation Uploaded by User

Top Features by Cosine Similarity

Comparing With GPT2-SMALL @ 2-res-jb

Configuration

jbloom/GPT2-Small-SAEs-Reformatted/blocks.2.hook_resid_pre

Prompts (Dashboard): 24,576 prompts, 128 tokens each
Dataset (Dashboard): Skylion007/openwebtext
Features: 24,576
Data Type: torch.float32
Hook Point: blocks.2.hook_resid_pre
Architecture: standard
Context Size: 128
Dataset: Skylion007/openwebtext
Hook Point Layer:
Activation Function: relu
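
As a rough illustration of what these configuration fields describe, the sketch below computes feature activations for a ReLU sparse autoencoder reading from a residual-stream vector. The encoder form used here (subtract the decoder bias, apply encoder weights and bias, then ReLU) is a common convention and is an assumption; the page itself only states "standard" and "relu". The function name `sae_encode`, the dimensions, and all weights are hypothetical toy values, not taken from the actual SAE.

```python
def relu(values):
    """Elementwise ReLU on a list of floats."""
    return [max(0.0, v) for v in values]


def sae_encode(x, W_enc, b_enc, b_dec):
    """Toy SAE encoder: relu(W_enc @ (x - b_dec) + b_enc).

    x      : residual-stream vector (list of floats)
    W_enc  : one row of weights per feature (assumed layout)
    b_enc  : one encoder bias per feature
    b_dec  : decoder bias, subtracted from x before encoding (assumed)
    """
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    pre_acts = [
        sum(c * w for c, w in zip(centered, row)) + b
        for row, b in zip(W_enc, b_enc)
    ]
    return relu(pre_acts)


# Hypothetical toy example: 3-dimensional input, 2 features.
x = [1.0, -2.0, 0.5]
W_enc = [[1.0, 0.0, 0.0],
         [0.0, 1.0, 0.0]]
b_enc = [0.0, 0.5]
b_dec = [0.5, 0.0, 0.0]

acts = sae_encode(x, W_enc, b_enc, b_dec)
print(acts)
```

In the real configuration the input would be the 128-token residual stream at blocks.2.hook_resid_pre and there would be 24,576 features rather than 2.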

•LOTR

Negative Logits

"âĸ¬"        -0.81
"aminer"     -0.78
"âķĲ"        -0.72
"âķĲâķĲ"     -0.71
"Ŀ"          -0.69
"amples"     -0.68
"hift"       -0.68
"ciples"     -0.68
"packages"   -0.66
"eper"       -0.63

Positive Logits

"olkien"     0.99
" Tolkien"   0.98
" Gand"      0.96
" Hobbit"    0.95
" Bagg"      0.93
"alf"        0.83
"hob"        0.81
"gob"        0.79
"inka"       0.75
" Lann"      0.75
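
Lists like the positive and negative logits above are often produced by projecting a feature's decoder direction through the model's unembedding matrix and ranking tokens by the resulting score. Whether this page's values were computed exactly that way is an assumption. The sketch below shows the idea with a hypothetical two-dimensional decoder direction and a three-token vocabulary; the function name `logit_effects` and all numbers are illustrative and are not the values from this dashboard.

```python
def logit_effects(decoder_dir, unembed_cols, vocab):
    """Rank tokens by dot(decoder_dir, unembedding column).

    decoder_dir  : feature's decoder vector (list of floats)
    unembed_cols : one unembedding vector per vocabulary token (assumed layout)
    vocab        : token strings, aligned with unembed_cols
    """
    scores = {
        tok: sum(d * w for d, w in zip(decoder_dir, col))
        for tok, col in zip(vocab, unembed_cols)
    }
    # Highest score first: the head holds positive logits, the tail negative.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy values, chosen only to show the ranking.
vocab = [" Tolkien", " Hobbit", "packages"]
unembed_cols = [[0.5, 0.25], [0.25, 1.0], [-0.5, 0.0]]
decoder_dir = [1.0, 0.0]

ranked = logit_effects(decoder_dir, unembed_cols, vocab)
for token, score in ranked:
    print(repr(token), score)
```

Printing tokens with `repr` keeps leading spaces visible, which matters because " Tolkien" and "olkien" are distinct tokens.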

Activations Density: 0.024%
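
To give a sense of scale for this density, the sketch below converts it into an approximate count of token positions. It assumes the density is the fraction of token positions at which the feature is nonzero, and that it was measured over the dashboard prompt set (24,576 prompts of 128 tokens each); the page does not confirm either assumption.

```python
# Figures from the Configuration section of this page.
prompts = 24_576
tokens_per_prompt = 128

# Density as reported, converted from percent to a fraction.
density = 0.024 / 100

# Assumption: density = active token positions / total token positions.
total_tokens = prompts * tokens_per_prompt
approx_active = round(total_tokens * density)

print(total_tokens, approx_active)
```

Under those assumptions the feature fires on well under a thousand of roughly three million token positions, which is consistent with a narrow, topic-specific feature.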

No Known Activations