Explanations

Unusual characters (np_max-act-logits · gemini-2.0-flash)

Configuration

andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_19/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.19.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

" Silk": -0.07
" Exhibition": -0.06
" Chef": -0.06
"Blue": -0.06
"lder": -0.06
" development": -0.06
"ıklı": -0.06
" hjem": -0.06
" Each": -0.06
" providing": -0.06

Positive Logits

"终点": 0.07
"歷史": 0.07
"☿": 0.07
"芫": 0.07
"鬧": 0.07
"Mn": 0.07
"sne": 0.07
"Intl": 0.07
" Verg": 0.06
" =================================================================================": 0.06

Activations Density: 0.007%

No Known Activations