Explanations

Random character combinations

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard)

Various

Features

131,072

Data Type

float32

Hook Name

blocks.11.hook_resid_post

Architecture

standard

Context Size

1,024

Dataset

monology/pile-uncopyrighted

Negative Logits

 Marc

-0.10

ци

-0.09

enic

-0.09

Marc

-0.09

Rac

-0.09

 Narc

-0.08

ic

-0.08

dic

-0.08

IC

-0.08

 marc

-0.08

Positive Logits

0.18

0.17

ow

0.17

OW

0.14

aw

0.14

.W

0.13

ew

0.13

0.12

DW

0.12

Activations Density 0.225%
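
The density figure above is usually read as the fraction of token positions on which the feature fires. The sketch below illustrates that reading with toy data. The function name `activation_density` and the zero threshold are assumptions made for illustration; they are not taken from the dashboard or from any library.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    `activations` is a flat list of per-token feature activations.
    Both the name and the default threshold are hypothetical.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data (not real activations): 9 active positions out of 4,000 tokens.
toy_activations = [0.0] * 3991 + [1.5] * 9
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

With this toy input, 9 of 4,000 positions are active, which matches the 0.225% shown on the dashboard. The real value would be computed over the dashboard's own dataset sample.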
