Explanations: none

Configuration

andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_23/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.23.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted
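
The hook name and the SAE path in the configuration both appear to point at layer 23. As a minimal sketch, assuming the hook name follows the pattern `blocks.<layer>.hook_resid_post` and the path contains a segment of the form `resid_post_layer_<layer>`, the two can be cross-checked as below. The helper names `layer_from_hook` and `layer_from_sae_path` are hypothetical and not part of any library.

```python
import re

# Values copied from the configuration above.
HOOK_NAME = "blocks.23.hook_resid_post"
SAE_PATH = "andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_23/trainer_1"


def layer_from_hook(hook_name: str) -> int:
    """Extract the layer index from a 'blocks.<layer>.hook_resid_post' name (assumed pattern)."""
    match = re.fullmatch(r"blocks\.(\d+)\.hook_resid_post", hook_name)
    if match is None:
        raise ValueError(f"unexpected hook name: {hook_name!r}")
    return int(match.group(1))


def layer_from_sae_path(path: str) -> int:
    """Extract the layer index from a 'resid_post_layer_<layer>' path segment (assumed pattern)."""
    match = re.search(r"resid_post_layer_(\d+)", path)
    if match is None:
        raise ValueError(f"no layer segment found in: {path!r}")
    return int(match.group(1))


hook_layer = layer_from_hook(HOOK_NAME)
path_layer = layer_from_sae_path(SAE_PATH)
layers_agree = hook_layer == path_layer
```

If the two indices ever disagree, that would suggest the dashboard metadata and the SAE release path refer to different hook points.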


Negative Logits

Logit   Token (quoted to show leading spaces)
-0.08   ".bias"
-0.07   "(g"
-0.07   "(Roles"
-0.07   "且"
-0.07   " tuition"
-0.07   " tqdm"
-0.07   " chương"
-0.07   " WRITE"
-0.07   "(Button"
-0.07   "/arm"

Positive Logits

Logit   Token (quoted to show leading spaces)
0.08    "Unc"
0.07    "//================================================================================"
0.07    " popularity"
0.07    "scripción"
0.07    " supporters"
0.06    " eyel"
0.06    "徽"
0.06    " simplified"
0.06    " depreci"
0.06    "稱"

Activations Density: 0.002%
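
To put the density figure in perspective, the arithmetic below converts it into an expected number of firings per context window. This is only a sketch: it assumes that "activations density" means the fraction of token positions on which the feature has a nonzero activation, and it reuses the context size of 1,024 from the configuration. The function names are hypothetical.

```python
# Values copied from the dashboard.
DENSITY_PERCENT = 0.002   # Activations Density, in percent
CONTEXT_SIZE = 1024       # Context Size, in tokens


def expected_firings_per_context(density_percent: float, context_size: int) -> float:
    """Average number of token positions per context on which the feature fires.

    Assumes density is the fraction of token positions with nonzero activation.
    """
    return density_percent / 100.0 * context_size


def contexts_per_firing(density_percent: float, context_size: int) -> float:
    """Average number of contexts one would scan to see a single firing."""
    return 1.0 / expected_firings_per_context(density_percent, context_size)


per_context = expected_firings_per_context(DENSITY_PERCENT, CONTEXT_SIZE)
per_firing = contexts_per_firing(DENSITY_PERCENT, CONTEXT_SIZE)
print(f"expected firings per context: {per_context:.5f}")
print(f"contexts per firing (average): {per_firing:.1f}")
```

Under that assumption the feature fires on roughly 2 of every 100,000 tokens, or about once per 49 full-length contexts on average.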

No Known Activations
