INDEX

Explanations

Saving and protecting

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various

Features: 131,072

Data Type: float32

Hook Name: blocks.11.hook_resid_post

Architecture: standard

Context Size: 1,024

Dataset: monology/pile-uncopyrighted

Negative Logits

" inducing"  -0.06
"Ꮯ"  -0.06
"Disp"  -0.06
"ennent"  -0.06
" aroused"  -0.06
" 아니"  -0.06
"风情"  -0.06
"芳香"  -0.06
" Trap"  -0.06
"PLICATION"  -0.06

Positive Logits

"_operation"  0.08
"耳朵"  0.07
" отнош"  0.07
" Rogers"  0.07
"젝"  0.07
" Weld"  0.07
"法制"  0.07
"(td"  0.06
"mó"  0.06
".azure"  0.06

Activations Density 0.039%
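As a rough illustration of what the density figure measures, the sketch below assumes that activation density is the fraction of sampled token positions on which the feature's activation is above zero. That definition, the helper name `activation_density`, and the sample values are assumptions for illustration only and are not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumed definition, for illustration only.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    firing = sum(1 for a in activations if a > threshold)
    return firing / len(activations)


# Hypothetical sample: 10,000 token positions, 4 of which fire.
sample = [0.0] * 9996 + [1.2, 0.4, 3.1, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this definition, a density of 0.039% would correspond to the feature firing on roughly 39 of every 100,000 tokens.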

Saving and protecting

No Known Activations
