Explanations

research studies

np_max-act · gemini-2.0-flash

Configuration: mwhanna/qwen3-4b-transcoders/layer_23.safetensors
Prompts (Dashboard): 16,384 prompts, 128 tokens each
Dataset (Dashboard): monology/pile-uncopyrighted
Features: 163,840
Data Type: float32
Hook Name: blocks.23.mlp.hook_in
Architecture: transcoder
Context Size: 8,192
Dataset: monology/pile-uncopyrighted
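
The configuration describes a transcoder hooked at blocks.23.mlp.hook_in. As a rough illustration of what that architecture does, a transcoder reads the MLP input, encodes it into a wide, sparse feature vector, and decodes a prediction of the MLP output. The sketch below assumes a plain ReLU encoder and uses hypothetical toy weights and names (W_ENC, B_ENC, W_DEC, B_DEC, transcoder_forward). The real checkpoint has 163,840 features and may use a different activation function or parameter layout.

```python
# Toy transcoder forward pass in pure Python.
# All sizes, weights, and names here are hypothetical, chosen for illustration.

def relu(v):
    return [max(0.0, x) for x in v]

def matvec(m, v):
    # m is a list of rows; returns the matrix-vector product m @ v
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def transcoder_forward(mlp_in, w_enc, b_enc, w_dec, b_dec):
    """Encode the MLP input into features, then predict the MLP output."""
    features = relu(add(matvec(w_enc, mlp_in), b_enc))
    mlp_out_hat = add(matvec(w_dec, features), b_dec)
    return features, mlp_out_hat

# Hypothetical 2-dimensional model with 3 features
W_ENC = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]  # 3 x 2 encoder
B_ENC = [0.0, -0.5, 0.0]
W_DEC = [[1.0, 0.0, 0.5], [0.0, 2.0, 0.5]]      # 2 x 3 decoder
B_DEC = [0.1, 0.1]

features, mlp_out_hat = transcoder_forward([1.0, 0.25], W_ENC, B_ENC, W_DEC, B_DEC)
print(features)     # only the first feature is active for this input
print(mlp_out_hat)
```

In this toy input only one of the three features fires, which mirrors the sparse behaviour a very low activation density implies.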

Negative Logits

"跪" (kneel): -0.27
"一部" ("one" plus the measure word 部): -0.25
"idenav": -0.25
"Enumer": -0.25
"'&#": -0.24
"чь": -0.24
"ipi": -0.24
"AccessType": -0.24
" Gale": -0.23
"edin": -0.23

Positive Logits

"认为" (think, believe): 0.39
"都认为" (all believe): 0.36
" believe": 0.34
" believes": 0.33
"因此" (therefore): 0.32
"marvin": 0.31
" therefore": 0.30
" thinks": 0.29
" hopes": 0.27
" said": 0.27
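
Several tokens in these lists come from a byte-level BPE vocabulary, where each raw byte is stored as a printable stand-in character. The raw vocabulary entry `è®¤ä¸º`, for example, is the UTF-8 byte sequence for 认为. The sketch below shows one way to recover readable text, assuming the tokenizer uses the GPT-2 style byte-to-character table. The helper names (bytes_to_unicode, decode_token) are illustrative and not part of the dashboard.

```python
def bytes_to_unicode():
    """Byte -> printable character table used by GPT-2 style byte-level BPE."""
    keep = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    table = {b: chr(b) for b in keep}
    shift = 0
    for b in range(256):
        if b not in table:
            # Non-printable bytes are remapped to code points from 256 upward
            table[b] = chr(256 + shift)
            shift += 1
    return table

CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}

def decode_token(raw):
    """Map each stand-in character back to its byte, then decode as UTF-8."""
    out = bytearray()
    for ch in raw:
        if ch in CHAR_TO_BYTE:
            out.append(CHAR_TO_BYTE[ch])
        else:
            # Characters outside the table (e.g. a literal space that the
            # page already decoded) are passed through unchanged
            out.extend(ch.encode("utf-8"))
    return out.decode("utf-8", errors="replace")

for raw in ["è®¤ä¸º", "éĥ½è®¤ä¸º", "åĽłæŃ¤", " believe"]:
    print(repr(raw), "->", repr(decode_token(raw)))
```

Partial tokens that end in the middle of a multi-byte character decode to the replacement character rather than raising an error, because of `errors="replace"`.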

Activations Density 0.001%
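
To get a feel for what a density of 0.001% means, the arithmetic below converts it into an approximate token count. It assumes the density is measured over the dashboard prompt set (16,384 prompts of 128 tokens each) and that the displayed percentage is rounded; neither assumption is confirmed by the page.

```python
# Rough estimate of how many tokens activate the feature.
# Assumption: density is computed over the dashboard prompts listed above.
PROMPTS = 16_384
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.001

total_tokens = PROMPTS * TOKENS_PER_PROMPT
expected_active = total_tokens * DENSITY_PERCENT / 100

print(f"total tokens: {total_tokens:,}")
print(f"expected active tokens: about {round(expected_active)}")
```

Under these assumptions the feature would fire on roughly twenty tokens out of about two million.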

No Known Activations