Explanations

debunk

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

"iance"       -0.07
".lst"        -0.06
" opcion"     -0.06
" rollout"    -0.06
".output"     -0.06
" Hastings"   -0.06
"ldre"        -0.06
"hte"         -0.06
"(dead"       -0.06
"Tut"         -0.06

Positive Logits

" debunk"        0.09
" meth"          0.07
" figuring"      0.07
" searching"     0.07
" overcoming"    0.07
"การส"           0.06
" seab"          0.06
"(Messages"      0.06
" interacting"   0.06
".vars"          0.06

Activations Density: 0.014%

No Known Activations