Explanations

surround

np_max-act · gemini-2.0-flash

Configuration

mwhanna/qwen3-4b-transcoders/layer_4.safetensors

Prompts (Dashboard)

16,384 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Features

163,840

Data Type

float32

Hook Name

blocks.4.mlp.hook_in

Architecture

transcoder

Context Size

8,192

Dataset

monology/pile-uncopyrighted

Negative Logits

position

-0.28

抵

-0.26

抢先

-0.26

Slo

-0.25

ภาพ

-0.25

 china

-0.24

好吃

-0.24

 execution

-0.24

POSITION

-0.24

间隔

-0.24

Positive Logits

 INCIDENT

0.33

 Incident

0.27

 explo

0.26

olla

0.26

 surrounding

0.26

 ranger

0.25

iscard

0.25

.Ent

0.25

 adulti

0.24

oe

0.24

Activations Density 0.016%

No Known Activations