Explanations

No Explanations Found

Configuration

andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_23/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.23.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Not in Any Lists

No Comments

Negative Logits

" Tunis": -0.07
"RecognitionException": -0.07
" Mutex": -0.07
"火星": -0.06
" furry": -0.06
"先是": -0.06
"Ava": -0.06
"裾": -0.06
" Prepared": -0.06
" Bulgaria": -0.06

Positive Logits

"[tag": 0.07
" programmers": 0.07
".phase": 0.07
"leads": 0.07
"𝘏": 0.07
"펌": 0.07
"代言": 0.07
"쩌": 0.07
"®": 0.06
"ッション": 0.06

Activations Density 0.093%
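
To put the density figure in context, here is a minimal sketch of the arithmetic, assuming (this is not stated on the page) that the density is the fraction of tokens on which the feature fires. The function name `expected_activations` is hypothetical; the two constants are copied from the values above.

```python
# Values copied from the dashboard above.
ACTIVATION_DENSITY = 0.093 / 100  # 0.093% expressed as a fraction
CONTEXT_SIZE = 1024               # tokens per context


def expected_activations(density: float, n_tokens: int) -> float:
    """Expected number of tokens on which the feature fires.

    Assumption: `density` is the per-token firing fraction.
    """
    return density * n_tokens


per_context = expected_activations(ACTIVATION_DENSITY, CONTEXT_SIZE)
print(f"{per_context:.2f} expected activations per context")
```

Under that assumption the feature would fire on roughly one token per 1,024-token context on average.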

No Known Activations