Explanations

resignations and refusals

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_15/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.15.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

Token       Logit
" đời"      -0.06
"Sarah"     -0.06
"(so"       -0.06
"脸"        -0.06
" expres"   -0.06
" voted"    -0.06
" niece"    -0.06
"AX"        -0.06
"speed"     -0.06
"MSS"       -0.06

Positive Logits

Token                   Logit
" opens"                0.07
" Potential"            0.07
"setState"              0.07
"setPosition"           0.07
".open"                 0.06
"идент"                 0.06
" 제공"                 0.06
" UIStoryboardSegue"    0.06
" estaba"               0.06
" baptism"              0.06
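
The token and score pairs above are the kind of list usually produced by projecting a feature's decoder direction through the model's unembedding matrix and ranking the vocabulary by the result. The page does not say how its scores were computed, so the following is only a minimal sketch under that assumption. The function name `top_logit_effects` and the toy vocabulary, direction, and matrix are all hypothetical and are not taken from the model or the SAE.

```python
def top_logit_effects(decoder_direction, unembedding, vocab, k=3):
    """Rank tokens by the dot product of a feature direction with each
    unembedding column. Returns (most negative, most positive) lists."""
    effects = []
    for token_index, token in enumerate(vocab):
        # Column of the unembedding matrix that belongs to this token.
        column = [row[token_index] for row in unembedding]
        effect = sum(d * u for d, u in zip(decoder_direction, column))
        effects.append((token, effect))
    ranked = sorted(effects, key=lambda pair: pair[1])
    return ranked[:k], ranked[::-1][:k]


# Toy numbers for illustration only (assumption, not model data).
toy_vocab = ["a", "b", "c", "d"]
toy_direction = [1.0, -1.0]          # hypothetical 2-dimensional feature direction
toy_unembedding = [                  # shape: (d_model=2, vocab=4)
    [0.5, -0.25, 0.0, 0.25],
    [0.25, 0.5, -0.75, 0.25],
]

negative, positive = top_logit_effects(
    toy_direction, toy_unembedding, toy_vocab, k=2
)
print("negative:", negative)
print("positive:", positive)
```

Under this reading, scores as small as the ±0.06 to ±0.07 shown here would suggest that the feature has only a weak direct effect on any single output token.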

Activations Density 0.051%
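
To put the density figure in context, here is a small worked calculation. It assumes that "Activations Density" means the fraction of token positions on which the feature has a non-zero activation, which the page itself does not define. The helper names are hypothetical.

```python
# Values copied from the dashboard.
ACTIVATION_DENSITY = 0.051 / 100   # 0.051% expressed as a fraction
CONTEXT_SIZE = 1024                # tokens per context


def expected_active_tokens(density, n_tokens):
    """Expected number of token positions where the feature fires,
    assuming density is a per-token firing fraction."""
    return density * n_tokens


def tokens_per_activation(density):
    """Average number of tokens between activations under the same assumption."""
    return 1.0 / density


per_context = expected_active_tokens(ACTIVATION_DENSITY, CONTEXT_SIZE)
spacing = tokens_per_activation(ACTIVATION_DENSITY)
print(f"expected active tokens per context: {per_context:.3f}")
print(f"roughly one activation every {spacing:.0f} tokens")
```

Under that assumption the feature fires on about half a token per 1,024-token context on average, or roughly once every two thousand tokens.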

No Known Activations