Explanations

multiple

np_max-act-logits · gemini-2.0-flash

Configuration

andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_19/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.19.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

ysical         -0.07
.instance      -0.07
Artist         -0.07
史上           -0.07
(robot         -0.07
Convention     -0.07
.setTitle      -0.07
restaurants    -0.07
 psyche        -0.07
itions         -0.07

Positive Logits

热线               0.08
.ApplyResources    0.07
]:↵                0.07
居                 0.07
köp                0.07
Hot                0.07
-*                 0.07
_ANDROID           0.07
越来越高           0.06
 bring             0.06

Activations Density 0.001%

No Known Activations