Explanations

conversational text speaking about certainty, planning, or intent.

oai_token-act-pair · gemini-2.0-flash

Two-character strings

np_max-act-logits · gemini-2.0-flash

Configuration

google/gemma-scope-2b-pt-transcoders/layer_25/width_16k/average_l0_41

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Features

16,384

Data Type

float32

Hook Name

blocks.25.ln2.hook_normalized

Architecture

jumprelu_transcoder
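As a rough illustration of what a `jumprelu_transcoder` architecture usually means, the sketch below encodes an input vector into sparse features with a JumpReLU gate and decodes them into a predicted output. It assumes the input is the normalized activation named by the hook above and that each feature has its own learned threshold. The weights, sizes, and function names are toy placeholders, not values or APIs taken from this model.

```python
import numpy as np

def jumprelu(pre_acts, threshold):
    """JumpReLU: keep a pre-activation unchanged only where it exceeds its threshold."""
    return np.where(pre_acts > threshold, pre_acts, 0.0)

def transcoder_forward(x, W_enc, b_enc, threshold, W_dec, b_dec):
    """Encode x into sparse feature activations, then decode to a predicted output."""
    pre_acts = x @ W_enc + b_enc          # (batch, n_features)
    acts = jumprelu(pre_acts, threshold)  # sparse feature activations
    out = acts @ W_dec + b_dec            # (batch, d_model)
    return acts, out

# Toy sizes for illustration only (the real transcoder has 16,384 features).
rng = np.random.default_rng(0)
d_model, n_features = 8, 32
x = rng.normal(size=(4, d_model))
W_enc = rng.normal(size=(d_model, n_features))
b_enc = np.zeros(n_features)
threshold = np.full(n_features, 0.5)      # hypothetical per-feature thresholds
W_dec = rng.normal(size=(n_features, d_model))
b_dec = np.zeros(d_model)

acts, out = transcoder_forward(x, W_enc, b_enc, threshold, W_dec, b_dec)
print(acts.shape, out.shape)
```

Unlike a plain ReLU, the gate zeroes every value at or below the threshold, so any nonzero activation is strictly above it.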

Context Size

1,024

Dataset

monology/pile-uncopyrighted

Negative Logits

Logit  Token
-0.57  (token not shown)
-0.57   démocr
-0.56   courants
-0.56   fermés
-0.56  ↵↵
-0.55   détru
-0.55  ..."
-0.54   cérami
-0.54  )$\\
-0.54   écou

Positive Logits

Logit  Token
0.74  ISupport
0.64  (token not shown)
0.63  dm
0.62  hq
0.61  hp
0.60  bg
0.59  bs
0.59  Fx
0.58  FF
0.58  hb
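The logit lists pair output tokens with scores. One common way to produce such lists, offered here as an assumption and not as the dashboard's documented method, is to project the feature's decoder vector through the model's unembedding matrix and take the highest and lowest entries. In the sketch below, `decoder_vec`, `W_U`, `vocab`, and `top_and_bottom_logits` are hypothetical toy names and values.

```python
import numpy as np

def top_and_bottom_logits(decoder_vec, W_U, vocab, k=3):
    """Return the k most-promoted and k most-suppressed tokens for one feature."""
    effects = decoder_vec @ W_U               # one score per vocabulary entry
    order = np.argsort(effects)               # ascending by score
    bottom = [(vocab[i], float(effects[i])) for i in order[:k]]
    top = [(vocab[i], float(effects[i])) for i in order[::-1][:k]]
    return top, bottom

# Toy example: a 2-dimensional decoder vector and a 4-token vocabulary.
vocab = ["a", "b", "c", "d"]
decoder_vec = np.array([1.0, 0.0])
W_U = np.array([[0.7, -0.5, 0.1, 0.3],
                [0.2,  0.2, 0.2, 0.2]])

top, bottom = top_and_bottom_logits(decoder_vec, W_U, vocab, k=2)
print("top:", top)
print("bottom:", bottom)
```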

Activations Density 52.238%
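The page does not define how the density figure is computed. A minimal sketch, assuming it is the fraction of token positions where the feature's activation is greater than zero, looks like this. The dashboard's 24,576 prompts of 128 tokens each give 3,145,728 token positions in total; the `toy` array and the function name are illustrative only.

```python
import numpy as np

N_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
TOTAL_TOKENS = N_PROMPTS * TOKENS_PER_PROMPT  # 3,145,728 token positions

def activation_density(acts):
    """Fraction of token positions with a strictly positive activation (assumed definition)."""
    acts = np.asarray(acts)
    return np.count_nonzero(acts > 0) / acts.size

# Toy activations: 2 prompts x 4 tokens, 3 of the 8 positions are active.
toy = np.array([[0.0, 1.2, 0.0, 0.4],
                [0.0, 0.0, 3.1, 0.0]])
print(f"{activation_density(toy):.3%}")
```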
