INDEX

Explanations

np_max-act-logits · gemini-2.0-flash

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_27/trainer_1

Dataset (Dashboard)

Various

Features

131,072

Data Type

float32

Hook Name

blocks.27.hook_resid_post

Architecture

standard

Context Size

1,024

Dataset

monology/pile-uncopyrighted

Negative Logits

 mayo

-0.07

 releasing

-0.06

 AttributeSet

-0.06

 Поп

-0.06

(undecodable token)

-0.06

 milit

-0.06

 hacks

-0.06

,…↵↵

-0.06

 hype

-0.06

 width

-0.06

Positive Logits

sizlik

0.07

 demeanor

0.06

sense

0.06

memiş

0.06

 setDefaultCloseOperation

0.06

SAR

0.06

ydk

0.06

 předsed

0.06

 ефектив

0.06

puted

0.06

Activations Density

0.311%

No Known Activations
