Explanations

startup

np_max-act-logits · gemini-2.0-flash

Configuration

andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_19/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.19.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

#================================================================  -0.08
⎛  -0.07
썼  -0.07
 prés  -0.07
�  -0.07
 begins  -0.07
wünsche  -0.07
おく  -0.07
<num  -0.06
 userType  -0.06

Positive Logits

 startup  0.09
 psychedelic  0.08
省教育厅  0.07
歧视  0.07
 технологии  0.07
 insurgency  0.07
_dispatcher  0.07
uru  0.07
 nhiễ  0.07
戆  0.07
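
Note on the logit lists: values like these are conventionally obtained by projecting a feature's decoder direction through the model's unembedding matrix and ranking the vocabulary by the result. The sketch below illustrates that convention only; it is an assumption that this dashboard computes its lists the same way. The function name `top_logit_effects`, the 2-dimensional "residual stream", and every number in the toy matrices are hypothetical and are not taken from the model.

```python
import numpy as np

def top_logit_effects(decoder_direction, unembed, vocab, k=3):
    """Return the k most negative and k most positive (token, effect) pairs.

    decoder_direction: shape (d_model,), the feature's decoder vector.
    unembed:           shape (d_model, vocab_size), the unembedding matrix.
    """
    effects = decoder_direction @ unembed      # shape (vocab_size,)
    order = np.argsort(effects)                # ascending: most negative first
    negative = [(vocab[i], float(effects[i])) for i in order[:k]]
    positive = [(vocab[i], float(effects[i])) for i in order[::-1][:k]]
    return negative, positive

# Toy data (invented for illustration, not model weights).
toy_vocab = [" startup", " begins", " psychedelic", " userType"]
toy_unembed = np.array([[0.9, -0.7, 0.8, -0.6],
                        [0.0,  0.0, 0.0,  0.0]])
toy_direction = np.array([0.1, 0.0])

neg, pos = top_logit_effects(toy_direction, toy_unembed, toy_vocab, k=2)
print("negative:", neg)
print("positive:", pos)
```

With real weights, `decoder_direction` would be one row of the SAE decoder and `unembed` the language model's output embedding; both would have to be loaded separately.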

Activations Density 0.004%
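
Note on the density figure: activation density is usually read as the fraction of tokens on which the feature fires, so 0.004% corresponds to roughly 4 tokens in every 100,000. A minimal sketch under that assumption follows; the helper `activation_density`, the zero threshold, and the toy activation array are hypothetical and do not come from the dashboard.

```python
import numpy as np

def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds the threshold."""
    activations = np.asarray(activations)
    return float(np.mean(activations > threshold))

# Toy data: 100,000 token positions, the feature fires on 4 of them.
toy_activations = np.zeros(100_000)
toy_activations[[10, 500, 7_000, 42_000]] = 1.5

density = activation_density(toy_activations)
print(f"{density:.3%}")  # 0.004%
```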

No Known Activations