Explanations

hands

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_7/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.7.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

ekkür              -0.07
 paving            -0.07
 nexus             -0.06
.SerializeObject   -0.06
าประ               -0.06
ادة                -0.06
 Lobby             -0.06
-Cola              -0.06
(ro                -0.06
건                 -0.06

Positive Logits

 toxicity          0.07
 Efficient         0.07
 Dominican         0.06
iselect            0.06
 است               0.06
limit              0.06
母                 0.06
 составе           0.06
(","               0.06
 supervision       0.06

Activations Density: 0.004%

No Known Activations