Explanations

violence and restraint

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

Token            Logit
"difference"     -0.08
"ייע"            -0.07
"惘"             -0.07
"素晴"           -0.07
" Tradable"      -0.07
"apeutics"       -0.07
" PROCUREMENT"   -0.07
" itertools"     -0.07
"},{↵"           -0.07
"^{-"            -0.07

Positive Logits

Token            Logit
" dell"          0.07
" digestive"     0.07
" vatanda"       0.07
" tắm"           0.07
"Dragging"       0.06
"这样说"         0.06
"':"             0.06
" thirty"        0.06
"瞻"             0.06
"-dev"           0.06
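
The page does not say how the logit values above are produced. A common convention for SAE feature dashboards is to project the feature's decoder direction through the model's unembedding matrix and list the most negative and most positive entries. The sketch below illustrates that convention on toy data. The function name `top_logit_tokens`, the arrays, and the three-token vocabulary are all hypothetical and are not taken from this SAE or from Qwen2.5.

```python
import numpy as np

def top_logit_tokens(decoder_dir, unembed, vocab, k=1):
    """Rank vocabulary tokens by the logit a feature direction contributes.

    decoder_dir: (d_model,) feature decoder vector (hypothetical toy values)
    unembed:     (d_model, vocab_size) unembedding matrix (hypothetical)
    vocab:       list of token strings, one per unembedding column
    """
    logits = decoder_dir @ unembed          # shape: (vocab_size,)
    order = np.argsort(logits)              # ascending by logit
    negative = [(vocab[i], float(logits[i])) for i in order[:k]]
    positive = [(vocab[i], float(logits[i])) for i in order[::-1][:k]]
    return negative, positive

# Toy example: a 2-dimensional residual stream and a 3-token vocabulary.
vocab = ["x", "y", "z"]
decoder_dir = np.array([1.0, 0.0])
unembed = np.array([[0.5, -0.3, 0.1],
                    [9.0,  9.0, 9.0]])      # ignored: decoder_dir[1] == 0

negative, positive = top_logit_tokens(decoder_dir, unembed, vocab, k=1)
print(negative, positive)
```

Under this assumption, logit magnitudes as small as those listed here (at most 0.08) would mean the feature direction has little direct effect on any single output token, which is plausible for a layer-11 residual feature.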

Activations Density 0.009%
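
The density figure is not defined on the page. One common reading is the fraction of sampled tokens on which the feature's activation is nonzero. The sketch below computes that quantity on synthetic data. The function name `activation_density`, the threshold, and the sample are assumptions made for illustration only.

```python
import numpy as np

def activation_density(feature_acts, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold` (assumed definition)."""
    acts = np.asarray(feature_acts, dtype=float)
    return float(np.mean(acts > threshold))

# Synthetic sample: 9 active tokens out of 100,000.
acts = np.zeros(100_000)
acts[:9] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")
```

On this reading, a density of 0.009% corresponds to roughly 9 active tokens per 100,000, or about one per hundred 1,024-token contexts.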

No Known Activations