Explanations

abbreviations

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_19/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.19.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

SimpleName    -0.07
θούν    -0.06
FirstChild    -0.06
 ".";↵    -0.06
 derece    -0.06
 Kathryn    -0.06
Rad    -0.06
Richard    -0.06
 Παρ    -0.06
 мереж    -0.06

Positive Logits

نسا    0.07
しまう    0.07
ụy    0.07
astics    0.07
.WindowManager    0.07
ZW    0.06
_pago    0.06
 occupation    0.06
_resume    0.06
 зав    0.06

Activations Density: 0.004%

No Known Activations
