Explanations

Code segments

np_max-act-logits · gemini-2.0-flash

Configuration: andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_27/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.27.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

Token            Logit
" advancing"     -0.07
" regrets"       -0.07
" synaptic"      -0.07
" scholarly"     -0.06
"Animation"      -0.06
" window"        -0.06
" Apps"          -0.06
" Facilities"    -0.06
" viewed"        -0.06
" Rendering"     -0.06

Positive Logits

Token                   Logit
"])):↵"                 0.07
" uninsured"            0.07
" Kanun"                0.07
" khỏ"                  0.07
"洛"                    0.06
"encodeURIComponent"    0.06
" полот"                0.06
"radouro"               0.06
"SetTitle"              0.06
"†"                     0.06

Activations Density: 0.011%
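For context, an activation density figure like this is usually read as the share of sampled tokens on which the feature fires. The sketch below assumes that reading; the dashboard does not state its exact formula or threshold, and `activation_density` and the sample values are hypothetical, chosen only to reproduce a 0.011% figure.

```python
from typing import Iterable


def activation_density(activations: Iterable[float], threshold: float = 0.0) -> float:
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature is
    active; the dashboard does not document its exact definition.
    """
    values = list(activations)
    if not values:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in values if a > threshold)
    return active / len(values)


# Hypothetical sample: 11 active tokens out of 100,000 positions.
sample = [1.5] * 11 + [0.0] * 99_989
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.011% corresponds to roughly one active token in every 9,000, which fits a very sparse feature.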

No Known Activations
