Explanations

comma

np_max-act · gemini-2.0-flash

boilerplate assistant-safe/respectful preface phrases (e.g., "As a helpful and respectful assistant, I’m happy to…").

oai_token-act-pair · gpt-5-mini Triggered by @vetterc0

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_7/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.7.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

 readOnly	-0.06
smtp	-0.06
'},	-0.06
hidden	-0.06
 rval	-0.06
tts	-0.06
 Antar	-0.06
ctr	-0.06
غر	-0.05
Wildcard	-0.05

Positive Logits

/src	0.07
amientos	0.07
>>↵	0.07
 уклад	0.07
砲	0.07
億	0.06
-fix	0.06
covered	0.06
====↵	0.06
 Anatomy	0.06

Activations Density: 0.010%
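
As a rough illustration of what a density figure like this could mean, the sketch below assumes that density is the fraction of token positions on which the feature's activation is above zero. That definition, the function name `activation_density`, and the sample data are all assumptions made for illustration; the dashboard itself does not state how the value is computed.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumed definition for illustration only; not taken from the dashboard.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 10,000 token positions, the feature fires on exactly one.
sample = [0.0] * 9_999 + [1.7]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.010%
```

Under this assumed definition, a density of 0.010% would correspond to the feature firing on about one token in every ten thousand.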

No Known Activations