Explanations

No Explanations Found

Configuration

andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_19/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.19.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

Logit   Token
-0.08   "火灾"
-0.08   "emergency"
-0.08   " attract"
-0.08   " פעולה"
-0.08   "roduce"
-0.07   " Applied"
-0.07   " pornography"
-0.07   "鸪"
-0.07   " дор"
-0.07   "foreach"

Positive Logits

Logit   Token
0.08    " excludes"
0.07    "주의"
0.07    "_study"
0.07    " clases"
0.07    "kategori"
0.07    "(_."
0.07    "יבו"
0.06    "aan"
0.06    "ające"
0.06    " кан"

Activation Density: 0.001%
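For a sense of scale, a density of 0.001% corresponds to roughly one token position in 100,000. The sketch below assumes that density means the fraction of token positions on which the feature's activation is above zero. That definition, the helper name `activation_density`, and the toy data are illustrative assumptions, not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions where the feature fires (activation > threshold).

    Assumed definition for illustration; the dashboard does not state its formula.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Toy data (hypothetical): the feature fires on 1 of 100,000 token positions.
toy_activations = [0.0] * 99_999 + [2.5]
density = activation_density(toy_activations)
print(f"{density:.3%}")

# Expected number of firing positions in one 1,024-token context at this density.
expected_per_context = density * 1024
print(expected_per_context)
```

Under this assumption, a feature this sparse would be expected to fire about once in every hundred 1,024-token contexts.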

No Known Activations