Explanations

np_max-act-logits · gemini-2.0-flash

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_27/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.27.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted
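For readers who want to work with these settings programmatically, the configuration can be held in a small record. This is only a sketch: the `SaeConfig` class and its field names are hypothetical and do not come from any library, and the assumption that the layer index can be read from a hook name of the form `blocks.<N>.hook_resid_post` is based solely on the value shown on this page.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SaeConfig:
    """Hypothetical container for the configuration fields listed on the dashboard."""
    sae_id: str
    hook_name: str
    n_features: int
    dtype: str
    architecture: str
    context_size: int
    dataset: str

    @property
    def layer(self) -> int:
        # Assumption: the hook name follows "blocks.<N>.hook_resid_post".
        match = re.fullmatch(r"blocks\.(\d+)\.hook_resid_post", self.hook_name)
        if match is None:
            raise ValueError(f"unexpected hook name: {self.hook_name!r}")
        return int(match.group(1))


# Values copied from the configuration listed on this page.
cfg = SaeConfig(
    sae_id="andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_27/trainer_1",
    hook_name="blocks.27.hook_resid_post",
    n_features=131_072,
    dtype="float32",
    architecture="standard",
    context_size=1_024,
    dataset="monology/pile-uncopyrighted",
)

print(cfg.layer)                   # layer index parsed from the hook name
print(cfg.n_features == 2 ** 17)   # the feature count is a power of two
```

The parsed layer index agrees with the `resid_post_layer_27` segment of the SAE identifier, which is a cheap consistency check when copying settings by hand.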

Negative Logits

"Clients"          -0.07
" nouns"           -0.07
"蓝"               -0.06
"_verification"    -0.06
" separ"           -0.06
" legend"          -0.06
"\tclass"          -0.06
" Adams"           -0.06
"IPs"              -0.06
"email"            -0.06

Positive Logits

" apartheid"          0.07
" collectively"       0.06
" JSBracketAccess"    0.06
"ательно"             0.06
"upuncture"           0.06
"γκε"                 0.06
" 申博"               0.06
".ReadToEnd"          0.06
"โท"                  0.06
"tos"                 0.06
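Token lists like the ones above are commonly produced by projecting a feature's decoder direction through the model's unembedding and keeping the most positive and most negative entries. The sketch below illustrates that idea in pure Python on made-up toy vectors. It is an assumption about how such lists are typically built, not a description of how this dashboard computed its values, and the function names `logit_effects` and `top_and_bottom` are hypothetical.

```python
def logit_effects(decoder_vec, unembed):
    """Dot the feature's decoder direction with each token's unembedding vector."""
    return {
        token: sum(d * u for d, u in zip(decoder_vec, column))
        for token, column in unembed.items()
    }


def top_and_bottom(effects, k):
    """Return the k most positive and the k most negative (token, value) pairs."""
    ranked = sorted(effects.items(), key=lambda item: item[1], reverse=True)
    most_positive = ranked[:k]
    most_negative = ranked[-k:][::-1]  # most negative first
    return most_positive, most_negative


# Toy data only: a 2-dimensional "residual stream" and a 4-token vocabulary.
decoder_vec = [1.0, -0.5]
unembed = {
    "a": [0.5, 0.0],
    "b": [0.0, 1.0],
    "c": [-1.0, 0.0],
    "d": [0.25, 0.25],
}

effects = logit_effects(decoder_vec, unembed)
positive, negative = top_and_bottom(effects, k=2)
print(positive)
print(negative)
```

Under this reading, values as small as the ±0.07 listed above would mean the feature's direction barely shifts any token's logit.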

Activations Density: 0.002%
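To put the density figure in perspective, it can be converted into an expected number of activating tokens. The sketch below assumes that the density is the fraction of sampled tokens on which the feature is nonzero, and it reuses the context size of 1,024 from the configuration. The helper names are hypothetical.

```python
def density_fraction(density_percent: float) -> float:
    """Convert a density given in percent to a plain fraction of tokens."""
    return density_percent / 100.0


def expected_active_tokens(density_percent: float, n_tokens: int) -> float:
    """Expected number of activating tokens among n_tokens, assuming a uniform rate."""
    return density_fraction(density_percent) * n_tokens


density_percent = 0.002   # value shown on the dashboard
context_size = 1024       # context size from the configuration

per_context = expected_active_tokens(density_percent, context_size)
tokens_per_activation = 1.0 / density_fraction(density_percent)

print(f"expected activating tokens per context: {per_context:.4f}")
print(f"tokens per activation (approx.): {tokens_per_activation:,.0f}")
```

On these assumptions the feature fires on roughly one token in 50,000, or about 0.02 tokens per 1,024-token context, so most contexts would contain no activation at all.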

No Known Activations