Explanations

np_max-act-logits · gemini-2.0-flash

Configuration

andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_19/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.19.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

" bankrupt"  -0.08
"itic"  -0.08
" Pacific"  -0.08
" Chuck"  -0.07
"ﳋ"  -0.07
" России"  -0.07
" Vacc"  -0.07
" Hữu"  -0.07
" Druid"  -0.07
"兮"  -0.07

Positive Logits

"(style"  0.07
"\tproperty"  0.07
"small"  0.07
".presentation"  0.07
"justify"  0.07
"par"  0.07
"Definition"  0.07
"perform"  0.07
"_erase"  0.07
" collectionView"  0.07

Activations Density: 0.027%

No Known Activations
