
Explanations

mentions of specific names, potentially related to journalism or reporting

oai_token-act-pair · gpt-3.5-turbo Triggered by @bot


Configuration

jbloom/Gemma-2b-Residual-Stream-SAEs/gemma_2b_blocks.6.hook_resid_post_16384_anthropic_fast_lr

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

HuggingFaceFW/fineweb

Features

16,384

Data Type

torch.float32

Hook Point

blocks.6.hook_resid_post

Architecture

standard

Context Size

1,024

Dataset

HuggingFaceFW/fineweb

Hook Point Layer

6

Activation Function

relu
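As a rough illustration of what a "standard" architecture with a relu activation usually means for a sparse autoencoder, the sketch below computes feature activations as ReLU(W_enc · x + b_enc) for a single residual-stream vector. The function name `sae_encode`, the toy dimensions, and all weights are hypothetical and are not taken from this SAE; some implementations also subtract a decoder bias from the input first, which is omitted here.

```python
# Minimal sketch of a ReLU sparse-autoencoder encoder (toy sizes, made-up weights).
# The real SAE on this page has 16,384 features; here we use 3 features
# over a 2-dimensional input purely for illustration.

def relu(values):
    """Elementwise ReLU on a list of floats."""
    return [max(0.0, v) for v in values]

def sae_encode(x, W_enc, b_enc):
    """Return feature activations ReLU(W_enc @ x + b_enc).

    W_enc holds one row per feature; b_enc holds one bias per feature.
    """
    pre_acts = [
        sum(w * xi for w, xi in zip(row, x)) + bias
        for row, bias in zip(W_enc, b_enc)
    ]
    return relu(pre_acts)

# Hypothetical toy parameters (assumptions, not values from the SAE above).
x = [1.0, 2.0]
W_enc = [[1.0, 0.0], [0.0, -1.0], [0.5, 0.5]]
b_enc = [0.0, 0.0, -1.0]

acts = sae_encode(x, W_enc, b_enc)
print(acts)  # the second feature is clipped to zero by the ReLU
```

Only features with a positive pre-activation stay active, which is what makes the per-feature activation density a meaningful statistic.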


Negative Logits

<bos>

-0.90

 papà

-0.59

 chante

-0.59

jectures

-0.58

 portait

-0.58

plaatst

-0.56

voirs

-0.55

 curé

-0.55

 contributo

-0.54

 sindaco

-0.54

Positive Logits

 Roger

1.48

Roger

1.40

 ROGER

1.29

 roger

1.26

roger

1.05

 Rogers

0.94

Rogers

0.89

Rog

0.81

Rog

0.76

 ROGERS

0.71
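Logit lists like the ones above are commonly produced by projecting a feature's decoder direction through the model's unembedding matrix and sorting tokens by the resulting score. The sketch below assumes that method; it is not confirmed by this page. The function `logit_effects`, the three-token vocabulary, and every number are hypothetical toy values, not the figures shown in the lists.

```python
# Sketch: rank tokens by the dot product of a feature's decoder direction
# with each token's unembedding column. All values are invented for illustration.

def logit_effects(decoder_dir, W_U, vocab):
    """Return (token, score) pairs sorted from most promoted to most suppressed.

    decoder_dir: list of d_model floats (the feature's decoder direction)
    W_U: d_model rows, each with one entry per vocabulary token
    vocab: list of token strings, aligned with the columns of W_U
    """
    scores = {
        token: sum(d * W_U[i][j] for i, d in enumerate(decoder_dir))
        for j, token in enumerate(vocab)
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical toy setup: d_model = 2, three tokens.
vocab = [" Roger", "Roger", " chante"]
decoder_dir = [1.0, 0.5]
W_U = [
    [1.0, 0.75, -0.5],
    [0.5, 0.25, -0.25],
]

ranked = logit_effects(decoder_dir, W_U, vocab)
for token, score in ranked:
    print(repr(token), score)
```

Under this reading, the positive list names the tokens the feature pushes the model toward predicting, and the negative list names the tokens it pushes away from.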

Activations Density 0.505%
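To put the density figure in absolute terms, the short calculation below combines it with the dashboard prompt count listed under Configuration. It assumes the density is the fraction of dashboard tokens on which the feature has a nonzero activation, and that it was measured over those same 24,576 prompts of 128 tokens; neither assumption is stated on the page.

```python
# Back-of-the-envelope estimate of how many dashboard tokens activate this feature.
n_prompts = 24_576          # from "Prompts (Dashboard)"
tokens_per_prompt = 128     # from "Prompts (Dashboard)"
density = 0.505 / 100       # "Activations Density 0.505%" as a fraction

total_tokens = n_prompts * tokens_per_prompt
approx_active_tokens = round(total_tokens * density)

print(total_tokens)          # 3145728
print(approx_active_tokens)  # 15886
```

So, under these assumptions, the feature fires on roughly 16 thousand of about 3.1 million tokens.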
