Explanations

Repetitive or nonsensical texts

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

Hun: -0.08
.Magenta: -0.07
.Cart: -0.07
えない: -0.06
Abe: -0.06
 TestData: -0.06
уб: -0.06
Mag: -0.06
icas: -0.06
KER: -0.06

Positive Logits

licit: 0.09
셉: 0.07
栽: 0.07
透气: 0.07
_first: 0.07
ISTICS: 0.07
 Brian: 0.07
服装: 0.07
cq: 0.06
 vowel: 0.06

Activations Density 0.008%
