INDEX

Explanations

No Explanations Found

Configuration: andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_19/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.19.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

Token          Logit
" diff"        -0.07
"重要指示"     -0.07
"TEXT"         -0.06
"מטה"          -0.06
"☈"            -0.06
"中小学"       -0.06
"Address"      -0.06
" mornings"    -0.06
"ניוז"         -0.06
"/books"       -0.06

Positive Logits

Token            Logit
"ˆ"              0.07
"嬷"             0.07
" Boots"         0.07
" //</"          0.07
"保守"           0.07
" especial"      0.07
" occupations"   0.07
" часов"         0.07
"جاب"            0.07
" Luxembourg"    0.07

Activations Density 0.233%
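
The page does not define the density figure. A minimal sketch, assuming "Activations Density" means the fraction of token positions on which the feature's activation is above zero (the usual reading on SAE dashboards, but an assumption here). The function name `activation_density` and the sample values are hypothetical and are not taken from this dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" counts a token as active when the feature's
    activation is strictly greater than the threshold (zero by default).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations for one feature over 1,000 token positions,
# of which only 2 are nonzero.
sample = [0.0] * 998 + [1.7, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.233% would correspond to the feature firing on roughly 2 to 3 tokens per 1,000.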

No Known Activations