Explanations

Brevity or short summaries

np_max-act-logits · gemini-2.0-flash

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_23/trainer_1

Dataset (Dashboard)

Various

Features

131,072

Data Type

float32

Hook Name

blocks.23.hook_resid_post

Architecture

standard

Context Size

1,024

Dataset

monology/pile-uncopyrighted

Negative Logits

 Timestamp

-0.07

Fully

-0.06

ูม

-0.06

Factor

-0.06

_written

-0.06

Crop

-0.06

 grown

-0.06

_rgba

-0.06

错误

-0.06

 pent

-0.06

Positive Logits

 paranoia

0.08

 prá

0.07

 knocking

0.07

'nun

0.06

 chemical

0.06

sent

0.06

 полот

0.06

 ausp

0.06

 neigh

0.06

Highlights

0.06

Activations Density 0.210%

No Known Activations