Explanations

No Explanations Found

Configuration

andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_15/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.15.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted
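
For readers who want to carry these settings into a script, the configuration above can be sketched as a small record. The `SaeConfig` class, its field names, and the `layer_index` helper are hypothetical and chosen only for illustration; they are not part of any dashboard or SAE library API. The values are copied from the listing above.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SaeConfig:
    """Hypothetical container for the settings listed above (illustrative only)."""
    sae_id: str
    n_features: int
    dtype: str
    hook_name: str
    architecture: str
    context_size: int
    dataset: str

    def layer_index(self) -> int:
        # Assumes the hook name follows the "blocks.<N>.hook_resid_post" pattern seen here.
        match = re.fullmatch(r"blocks\.(\d+)\.hook_resid_post", self.hook_name)
        if match is None:
            raise ValueError(f"unexpected hook name: {self.hook_name!r}")
        return int(match.group(1))


config = SaeConfig(
    sae_id="andyrdt/saes-qwen2.5-7b-instruct/resid_post_layer_15/trainer_1",
    n_features=131_072,
    dtype="float32",
    hook_name="blocks.15.hook_resid_post",
    architecture="standard",
    context_size=1_024,
    dataset="monology/pile-uncopyrighted",
)

print(config.layer_index())  # layer parsed from the hook name
```

The parsed layer index agrees with the `resid_post_layer_15` segment of the identifier, and the feature count is a power of two (2^17).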

No Comments

Negative Logits

"sons"  -0.09
"Img"  -0.07
"מוש"  -0.07
"谤"  -0.07
"streams"  -0.07
"箱"  -0.07
"lobs"  -0.07
" Tube"  -0.07
" Watson"  -0.07
"shiv"  -0.06

Positive Logits

" полностью"  0.07
"更换"  0.07
"💺"  0.07
"_compat"  0.07
" impartial"  0.07
"	report"  0.07
" FETCH"  0.07
" הפר"  0.07
" unpredict"  0.07
"iliary"  0.07

Activations Density 0.191%
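
As a rough reading of the density figure, the arithmetic below assumes that "density" means the fraction of tokens on which the feature has a nonzero activation, and that contexts use the 1,024-token size listed under Configuration. Both are assumptions; the page does not define the term. The variable names are illustrative.

```python
# Assumption: density = fraction of tokens with nonzero activation for this feature.
density = 0.191 / 100          # 0.191% expressed as a fraction
context_size = 1024            # context size from the configuration listing

# Expected number of activating tokens in one full-length context.
expected_active_tokens = density * context_size

# Average number of tokens between activations.
tokens_per_activation = 1 / density

print(f"{expected_active_tokens:.2f} active tokens per context")
print(f"about 1 activation every {tokens_per_activation:.0f} tokens")
```

Under those assumptions the feature would fire on roughly two tokens per full context.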

No Known Activations