Explanations

code/configuration files

np_max-act-logits · gemini-2.0-flash

Configuration

andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_19/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.19.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

" snapshot"  -0.08
"茚"  -0.08
"Isl"  -0.07
"antt"  -0.07
" diagnose"  -0.07
" probable"  -0.07
" ashamed"  -0.07
" pastoral"  -0.07
"羔"  -0.07
"高血压"  -0.07

Positive Logits

"_raise"  0.07
"もあり"  0.06
"的基础上"  0.06
"山东"  0.06
".CreateCommand"  0.06
"าง"  0.06
"rog"  0.06
"结尾"  0.06
"孖"  0.06
"sterreich"  0.06

Activations Density: 0.000%

No Known Activations

This feature has no known activations.
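
For context, a density of 0.000% is what one would expect from a feature that never fires. The sketch below assumes that activation density means the fraction of token positions on which the feature's activation is above zero; the dashboard's exact definition is not stated here. The function name, the threshold parameter, and the sample activations are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumed definition, not confirmed by the dashboard.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations for one feature over eight token positions.
dead_feature = [0.0] * 8
sparse_feature = [0.0, 0.0, 1.5, 0.0, 0.0, 0.0, 0.3, 0.0]

print(f"{activation_density(dead_feature):.3%}")    # 0.000%
print(f"{activation_density(sparse_feature):.3%}")  # 25.000%
```

Under this assumed definition, a feature with no recorded activations reports 0.000% regardless of how many tokens were sampled.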