Explanations

associated

np_max-act · gemini-2.0-flash

connections between health-related terms and the impact of physical characteristics on medical conditions.

oai_token-act-pair · gpt-4o-mini Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted
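To make the configuration above more concrete, here is a minimal sketch of how a sparse autoencoder with a "standard" architecture might turn residual-stream vectors into feature activations. It assumes that "standard" means a ReLU encoder with the decoder bias subtracted from the input, which this page does not confirm. The dimensions are toy values (the listed SAE has 131,072 features), and the names `sae_encode`, `sae_decode`, `W_enc`, `W_dec`, `b_enc`, and `b_dec` are hypothetical.

```python
import numpy as np

def sae_encode(x, W_enc, b_enc, b_dec):
    # Assumed "standard" form: subtract decoder bias, project, add bias, ReLU.
    return np.maximum((x - b_dec) @ W_enc + b_enc, 0.0)

def sae_decode(f, W_dec, b_dec):
    # Linear reconstruction of the residual-stream vector from the features.
    return f @ W_dec + b_dec

rng = np.random.default_rng(0)
d_model, n_features = 16, 64  # toy sizes, not the real model dimensions

# Random float32 weights stand in for trained parameters.
W_enc = rng.standard_normal((d_model, n_features)).astype(np.float32)
W_dec = rng.standard_normal((n_features, d_model)).astype(np.float32)
b_enc = np.zeros(n_features, dtype=np.float32)
b_dec = np.zeros(d_model, dtype=np.float32)

# Four toy "token positions" of residual-stream activations.
x = rng.standard_normal((4, d_model)).astype(np.float32)
features = sae_encode(x, W_enc, b_enc, b_dec)
recon = sae_decode(features, W_dec, b_dec)
```

In a real setting, `x` would be the activations captured at the listed hook (`blocks.11.hook_resid_post`) rather than random data.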

Negative Logits

'Connor    -0.07
 PHONE    -0.07
_FRAME    -0.06
��    -0.06
훈    -0.06
Pil    -0.06
 Slots    -0.06
Browse    -0.06
윤    -0.06
	email    -0.06

Positive Logits

uego    0.07
Que    0.07
 (){↵    0.07
город    0.06
ΑΡ    0.06
icion    0.06
ــــــــ    0.06
cision    0.06
!!!    0.06
.setChecked    0.06
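
The negative and positive logit lists above are the kind of output one gets by projecting a feature's decoder direction through the model's unembedding matrix and reading off the most suppressed and most promoted tokens. The sketch below illustrates that idea under stated assumptions: the dashboard's actual computation is not described on this page, and the function `top_and_bottom_logits`, the matrix `W_U`, the vector `decoder_dir`, and the toy vocabulary are all hypothetical.

```python
import numpy as np

def top_and_bottom_logits(decoder_dir, W_U, vocab, k=2):
    # Effect of the feature direction on each token's logit.
    logits = decoder_dir @ W_U  # shape: (vocab_size,)
    order = np.argsort(logits)  # ascending
    negative = [(vocab[i], float(logits[i])) for i in order[:k]]
    positive = [(vocab[i], float(logits[i])) for i in order[::-1][:k]]
    return positive, negative

# Toy data: 2-dimensional model, 4-token vocabulary (illustrative values only).
vocab = ["tok_a", "tok_b", "tok_c", "tok_d"]
W_U = np.array([[0.07, -0.07, 0.01, 0.0],
                [0.50, 0.50, 0.50, 0.5]])
decoder_dir = np.array([1.0, 0.0])

positive, negative = top_and_bottom_logits(decoder_dir, W_U, vocab)
print("positive:", positive)
print("negative:", negative)
```

With values as small as those listed above (about ±0.07), the feature has only a weak direct effect on any single output token.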

Activations Density 0.045%
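
Activation density is commonly read as the fraction of token positions on which the feature fires at all. A minimal sketch of that calculation follows, assuming "fires" means an activation strictly greater than zero. The page does not state the exact definition used, and the helper name `activation_density` and the toy data are hypothetical.

```python
import numpy as np

def activation_density(acts):
    # Fraction of token positions with a strictly positive activation.
    acts = np.asarray(acts)
    return float(np.count_nonzero(acts > 0) / acts.size)

# Toy data: the feature fires on 5 of 10,000 token positions.
toy = np.zeros(10_000, dtype=np.float32)
toy[:5] = 1.0

density = activation_density(toy)
print(f"{density:.3%}")
```

A density of 0.045% on this reading would mean the feature is active on roughly 4 to 5 of every 10,000 tokens.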

No Known Activations