Explanations

Biological mechanisms

np_max-act-logits · gemini-2.0-flash

Configuration

andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_19/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.19.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

-general: -0.07
 Adventure: -0.07
说不定: -0.07
ért: -0.07
🥦: -0.07
に関する: -0.07
дель: -0.07
陆军: -0.06
 McDonald: -0.06
 Platt: -0.06

Positive Logits

"،: 0.08
_PHASE: 0.07
odied: 0.07
量: 0.07
 simil: 0.07
трат: 0.07
 Turns: 0.07
iscrimination: 0.07
.Est: 0.07
 строк: 0.07

Activations Density 0.077%
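
As a rough illustration of the density figure above, the sketch below assumes that activation density means the fraction of token positions on which the feature's activation is above zero. That definition, the helper name `activation_density`, and the toy activations are all assumptions made for illustration; none of them come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumed definition: a feature counts as "active" on a token when its
    activation is strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations for a single feature over 10,000 tokens:
# the feature fires on 8 positions and is zero everywhere else.
acts = [0.0] * 10_000
for position in (3, 42, 1200, 4500, 7777, 8000, 9001, 9999):
    acts[position] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")  # density expressed as a percentage
```

Under this reading, a density of 0.077% would mean the feature fires on roughly 77 of every 100,000 tokens.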

Biological mechanisms

No Comments

No Known Activations
