INDEX

Explanations

online resources and information

np_max-act-logits · gemini-2.0-flash

Configuration

andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_19/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.19.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

" Theo": -0.07
" hope": -0.07
" Gefühl": -0.07
"抉择": -0.07
"청소": -0.07
" désormais": -0.07
"此刻": -0.07
" performing": -0.06
"⸽": -0.06
" compliance": -0.06

Positive Logits

"芳": 0.07
"itemId": 0.07
" wont": 0.07
" antibodies": 0.07
"辈": 0.07
" sharpen": 0.07
"caf": 0.07
"_AB": 0.07
"ernet": 0.07
"romosome": 0.07

Activations Density 0.112%

No Known Activations
