Explanations

ingredients/components lists

np_max-act-logits · gemini-2.0-flash

Configuration

andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_19/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.19.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

ﮎ  -0.08
 vatandaş  -0.07
私立  -0.07
مناطق  -0.07
 deserves  -0.07
dõ  -0.07
erala  -0.07
蔬  -0.07
 seçim  -0.07
 yık  -0.06

Positive Logits

 CATEGORY  0.07
 conference  0.07
SB  0.07
𝕂  0.07
全覆盖  0.06
NON  0.06
aine  0.06
Reflection  0.06
 shield  0.06
延  0.06

Activations Density: 0.010%

No Known Activations