Explanations

movies and books

np_max-act-logits · gemini-2.0-flash

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_27/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.27.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

"-common"  -0.07
"contracts"  -0.07
"IU"  -0.07
" besides"  -0.06
"arsity"  -0.06
"'nın"  -0.06
" restroom"  -0.06
" mistress"  -0.06
"IDEOS"  -0.06
" fullscreen"  -0.06

Positive Logits

"plementary"  0.06
"�"  0.06
" Coming"  0.06
"$file"  0.06
"Пр"  0.06
" brushes"  0.06
"кость"  0.06
" timed"  0.06
" Absolutely"  0.06
" кол"  0.06

Activations Density 0.040%

No Known Activations
