INDEX

Explanations

instances of conjunctions, particularly "and."

oai_token-act-pair · gpt-4o-mini Triggered by @bot

Configuration

jbloom/Gemma-2b-Residual-Stream-SAEs/gemma_2b_blocks.12.hook_resid_post_16384

Prompts (Dashboard): 24,576 prompts, 128 tokens each
Dataset (Dashboard): HuggingFaceFW/fineweb
Features: 16,384
Data Type: torch.float32
Hook Point: blocks.12.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: HuggingFaceFW/fineweb
Hook Point Layer: 12
Activation Function: relu
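To make the configuration concrete, here is a minimal sketch of how a "standard" ReLU sparse autoencoder might turn a residual-stream vector into feature activations. It assumes the common form relu((x - b_dec) @ W_enc + b_enc); the source does not confirm that this SAE subtracts the decoder bias from its input, and the names `sae_encode`, `W_enc`, `b_enc`, and `b_dec` are illustrative. The toy sizes (2 input dimensions, 3 features) stand in for the real residual stream width and the 16,384 features.

```python
def relu(values):
    """Element-wise ReLU, matching the 'relu' activation in the configuration."""
    return [max(0.0, v) for v in values]


def sae_encode(x, W_enc, b_enc, b_dec):
    """Encode one residual-stream vector into feature activations.

    x:     list of d_model floats (activation at the hook point)
    W_enc: d_model rows, each with n_features floats
    b_enc: n_features floats
    b_dec: d_model floats (ASSUMPTION: subtracted from the input first)
    """
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    pre_acts = [
        sum(centered[i] * W_enc[i][j] for i in range(len(centered))) + b_enc[j]
        for j in range(len(b_enc))
    ]
    return relu(pre_acts)


# Toy example: d_model = 2, n_features = 3 (hypothetical weights).
x = [1.0, -2.0]
W_enc = [
    [1.0, 0.0, -1.0],
    [0.0, 1.0, 1.0],
]
b_enc = [0.0, 0.5, 0.0]
b_dec = [0.0, 0.0]

acts = sae_encode(x, W_enc, b_enc, b_dec)
print(acts)  # only features with a positive pre-activation stay non-zero
```

Because of the ReLU, most features are exactly zero for any given token, which is what makes a per-feature activation density meaningful.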

Negative Logits

" intersper"     -1.05
" felicity"      -0.90
" encomp"        -0.87
" liberality"    -0.87
" Thos"          -0.86
" quitted"       -0.82
" Pamph"         -0.82
" gaily"         -0.81
" Shakspeare"    -0.79
" Augu"          -0.76

Positive Logits

"GYPT"           0.74
" Muito"         0.73
" Talvez"        0.72
" Adicion"       0.72
"sizePolicy"     0.71
" Estou"         0.69
" Hitam"         0.68
"visející"       0.67
" confronti"     0.67
" Saludos"       0.65
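The negative and positive logit lists above can be read as the tokens whose output logits this feature pushes down or up the most. A common way to obtain such a ranking, assumed here rather than confirmed by the source, is to project the feature's decoder direction through the model's unembedding matrix and sort the result. The sketch below uses hypothetical names (`logit_effects`, `top_and_bottom`, `W_U`) and a toy 4-token vocabulary.

```python
def logit_effects(decoder_row, W_U):
    """Project one feature's decoder direction onto every vocabulary token.

    decoder_row: d_model floats (the feature's decoder direction)
    W_U:         d_model rows, each with vocab_size floats (unembedding)
    """
    vocab_size = len(W_U[0])
    return [
        sum(decoder_row[i] * W_U[i][t] for i in range(len(decoder_row)))
        for t in range(vocab_size)
    ]


def top_and_bottom(effects, tokens, k):
    """Return (k most negative, k most positive) (token, effect) pairs."""
    ranked = sorted(zip(tokens, effects), key=lambda pair: pair[1])
    most_negative = ranked[:k]
    most_positive = ranked[::-1][:k]
    return most_negative, most_positive


# Toy example with hypothetical values: d_model = 2, vocab of 4 tokens.
tokens = ["a", "b", "c", "d"]
decoder_row = [1.0, -1.0]
W_U = [
    [0.5, -0.5, 0.0, 1.0],
    [0.0, 0.5, 0.25, -1.0],
]

effects = logit_effects(decoder_row, W_U)
negative, positive = top_and_bottom(effects, tokens, k=2)
print("negative:", negative)
print("positive:", positive)
```

Under this reading, the lists describe the feature's effect on the model's output rather than the inputs that activate it, which would explain why neither list contains "and".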

Activations Density 0.277%
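As a rough illustration of what a 0.277% density means, the sketch below computes density as the fraction of tokens on which the feature is non-zero, then estimates how many dashboard tokens that would be. It assumes the density is measured over the 24,576 dashboard prompts of 128 tokens each, which the source does not state; `activation_density` is an illustrative helper.

```python
def activation_density(activations):
    """Fraction of token positions where the feature fires (activation > 0)."""
    if not activations:
        return 0.0
    firing = sum(1 for a in activations if a > 0)
    return firing / len(activations)


# Tiny example: the feature fires on 1 of 4 tokens.
print(activation_density([0.0, 0.0, 1.5, 0.0]))

# ASSUMPTION: density is taken over all dashboard tokens.
total_tokens = 24_576 * 128
approx_firing = round(total_tokens * 0.00277)
print(total_tokens, approx_firing)
```

Under that assumption, the feature would fire on roughly 8,700 of about 3.1 million tokens.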
