Explanations

space

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

乗り    -0.08
 agreed    -0.07
oogle    -0.07
Won    -0.07
phinx    -0.07
Review    -0.07
行贿    -0.07
err    -0.07
 predecessor    -0.07
 start    -0.06

Positive Logits

 devastation    0.07
 ************************    0.07
czę    0.07
 laughter    0.07
↵↵    0.07
 сильно    0.06
تكامل    0.06
.Adapter    0.06
_frac    0.06
_Handle    0.06

Activations Density 0.002%
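
One way to read the density figure above is as the share of sampled tokens on which the feature fires. The sketch below works from that assumption; the dashboard's exact definition is not stated on this page. The function name, the zero threshold, and the sample data are all hypothetical and chosen only to reproduce a value of 0.002%.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of tokens with a nonzero
    (above-threshold) feature activation, expressed as a percentage.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical data: 2 active tokens out of 100,000 sampled tokens.
acts = [0.0] * 100_000
acts[10] = 1.3
acts[500] = 0.4

density = activation_density(acts)
print(f"{density:.3f}%")  # prints "0.002%"
```

At this density the feature fires on roughly 1 token in 50,000, so a small sample of text can easily contain no activating examples at all.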

No Known Activations
