INDEX

Explanations

warnings and disclaimers

np_max-act-logits · gemini-2.0-flash

Configuration

andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_19/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.19.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

"멀"  -0.09
".CH"  -0.08
" membr"  -0.07
" retir"  -0.07
" model"  -0.07
"My"  -0.07
"/send"  -0.07
" meter"  -0.07
"put"  -0.07
"MW"  -0.07

Positive Logits

"تقارير"  0.07
"ประชา"  0.07
"外围"  0.07
" Cypress"  0.07
"剧情"  0.06
" Reign"  0.06
" сохран"  0.06
" RoundedRectangle"  0.06
"gba"  0.06
" dokładnie"  0.06
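
The logit lists above pair each token with a score for this feature. One common way such scores are produced, and this is an assumption rather than something the page states, is to take the dot product of the feature's decoder direction with each token's unembedding vector and then list the lowest and highest results. The sketch below illustrates that idea on toy numbers; the function names, the 3-dimensional vectors, and the four-token vocabulary are all hypothetical.

```python
def logit_effects(decoder_dir, unembed_rows, vocab):
    """Score each token by the dot product of the feature's decoder
    direction with that token's unembedding vector (assumed method)."""
    return {
        tok: sum(d * w for d, w in zip(decoder_dir, row))
        for tok, row in zip(vocab, unembed_rows)
    }


def most_negative_and_positive(scores, k):
    """Return the k lowest-scoring and k highest-scoring (token, score) pairs."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1])
    return ranked[:k], ranked[::-1][:k]


# Toy, hypothetical values: a 3-dimensional residual stream and 4 tokens.
decoder_dir = [1.0, 0.0, -1.0]
vocab = ["a", "b", "c", "d"]
unembed_rows = [
    [0.5, 0.0, 0.0],   # "a" -> 0.5
    [0.0, 0.0, 0.25],  # "b" -> -0.25
    [0.0, 1.0, 0.0],   # "c" -> 0.0
    [1.0, 0.0, 1.0],   # "d" -> 0.0
]

scores = logit_effects(decoder_dir, unembed_rows, vocab)
negative, positive = most_negative_and_positive(scores, k=1)
print("negative:", negative)
print("positive:", positive)
```

Under this reading, scores as small as the ones listed here (all within about 0.09 of zero) would suggest the feature has only a weak direct effect on any particular output token.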

Activations Density 0.019%
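
Activation density is usually read as the fraction of sampled token positions on which the feature is active. Assuming that definition (the page itself does not define the term), 0.019% corresponds to roughly 19 active positions per 100,000 tokens. A minimal sketch with hypothetical names and toy data:

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`."""
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (not from the dashboard): 19 active positions in 100,000 tokens.
toy_activations = [0.0] * 99_981 + [1.5] * 19
density = activation_density(toy_activations)
print(f"{density:.3%}")  # prints 0.019%
```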

No Known Activations