Explanations

Scientific papers

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_7/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.7.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

 Queens

-0.07

;',

-0.07

atori

-0.06

otoxic

-0.06

 Buchanan

-0.06

 ",↵

-0.06

 "}↵

-0.06

 abbreviated

-0.06

opening

-0.06

roles

-0.06

Positive Logits

 kurulan

0.07

 SOFTWARE

0.07

énom

0.06

 DAMAGES

0.06

 oldukları

0.06

 sincerely

0.06

 Söz

0.06

 taxpayer

0.06

 Panasonic

0.06

 розрах

0.06

Activations Density 0.002%
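
To put the density figure in concrete terms, the sketch below converts it into an expected count of active token positions. It assumes that "Activations Density" means the fraction of token positions on which the feature has a nonzero activation; the function name `expected_active_tokens` is hypothetical and introduced only for illustration.

```python
# Assumption: density = fraction of token positions where the feature fires.
DENSITY_PERCENT = 0.002  # "Activations Density" shown on this page
CONTEXT_SIZE = 1024      # "Context Size" from the configuration


def expected_active_tokens(density_percent: float, n_tokens: int) -> float:
    """Expected number of token positions with a nonzero activation."""
    return density_percent / 100.0 * n_tokens


# Expected activations in a single 1,024-token context.
per_context = expected_active_tokens(DENSITY_PERCENT, CONTEXT_SIZE)

# Expected activations per one million tokens of text.
per_million = expected_active_tokens(DENSITY_PERCENT, 1_000_000)

print(f"per context: {per_context:.5f}")
print(f"per million tokens: {per_million:.1f}")
```

Under that reading, the feature would fire on roughly 20 tokens per million, or about once in every 50 contexts of 1,024 tokens.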

Scientific papers

No Known Activations
