Explanations

No Explanations Found

Configuration: andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_19/trainer_1
Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.19.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

No Comments

Negative Logits

"涂抹"  -0.07
"تط"  -0.07
" Retro"  -0.07
"˗"  -0.07
"lust"  -0.07
"overnment"  -0.07
"تكن"  -0.07
" clandest"  -0.06
" właśnie"  -0.06
"背叛"  -0.06

Positive Logits

"BLEM"  0.07
"_feat"  0.07
"我说"  0.07
" zeroes"  0.07
"万亿元"  0.07
"urgy"  0.07
" =============================================================="  0.07
",img"  0.06
".TestTools"  0.06
" illustrates"  0.06

Activations Density: 0.006%

No Known Activations