Explanations

ic

np_max-act-logits · gemini-2.0-flash

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_27/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.27.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

 jika         -0.06
 maintained   -0.06
ühr           -0.06
 sketches     -0.06
dav           -0.06
акс           -0.06
rin           -0.06
 controlling  -0.06
 simplicity   -0.06
actoring      -0.06

Positive Logits

']=="         0.07
oteca         0.07
Theo          0.06
:checked      0.06
θι            0.06
�от           0.06
 посл         0.06
asper         0.06
_DIPSETTING   0.06
 chromium     0.06

Activations Density 0.005%

No Known Activations