Explanations

Lists of varied items

np_max-act-logits · gemini-2.0-flash

Configuration

andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_23/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.23.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

"胳膊": -0.08
"Aj": -0.07
" sparked": -0.07
"أفر": -0.07
"ヒ": -0.07
"อนา": -0.07
"مدن": -0.07
"生态环境": -0.07
"Ipv": -0.07
"โด": -0.07

Positive Logits

"`]": 0.07
"contain": 0.07
"鿎": 0.07
" приним": 0.07
"•↵↵": 0.06
" downloadable": 0.06
"坚实": 0.06
"ﶈ": 0.06
"uest": 0.06
"	Public": 0.06

Activations Density 0.008%

Lists of varied items

No Comments

No Known Activations
