Explanations

Names, especially "Charles"

np_max-act-logits · gemini-2.0-flash

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_27/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.27.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted


Negative Logits

Token        Logit
" tastes"    -0.07
" conex"     -0.07
"_inc"       -0.06
"Faces"      -0.06
"dsa"        -0.06
"scalar"     -0.06
" choke"     -0.06
"cdb"        -0.06
"ioc"        -0.06
" Fucking"   -0.06

Positive Logits

Token        Logit
"elight"     0.06
" jeho"      0.06
"_padding"   0.06
"�"          0.06
"-cert"      0.06
" правил"    0.06
" object"    0.06
"kę"         0.06
"ламент"     0.06
" организ"   0.06

Activations Density 0.009%

No Known Activations
