Explanations

interpersonal/sexual encounters

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_15/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.15.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted
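
As a rough illustration of the configuration above, the sketch below records the listed values in a plain Python dictionary and derives two quantities from them: the layer index encoded in the hook name and the dictionary expansion factor. The residual-stream width of 4,096 for Llama 3.1 8B is not stated on this page and is supplied here as an assumption. The dictionary keys and helper names are hypothetical and do not correspond to any particular library's API.

```python
# Values copied from the configuration listing; the key names are illustrative only.
sae_config = {
    "sae_id": "andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_15/trainer_1",
    "hook_name": "blocks.15.hook_resid_post",
    "n_features": 131_072,
    "dtype": "float32",
    "architecture": "standard",
    "context_size": 1_024,
    "dataset": "monology/pile-uncopyrighted",
}

# Assumption (not stated on this page): Llama 3.1 8B uses a 4,096-wide residual stream.
D_MODEL = 4_096


def layer_from_hook(hook_name: str) -> int:
    """Extract the layer index from a hook name such as 'blocks.15.hook_resid_post'."""
    return int(hook_name.split(".")[1])


def expansion_factor(n_features: int, d_model: int) -> float:
    """Ratio of dictionary size to the width of the activations being encoded."""
    return n_features / d_model


if __name__ == "__main__":
    print("layer:", layer_from_hook(sae_config["hook_name"]))
    print("expansion factor:", expansion_factor(sae_config["n_features"], D_MODEL))
```

Under that assumed width, the dictionary is 32 times wider than the activations it encodes.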

Negative Logits

"opal"  -0.07
" thiết"  -0.07
" reliably"  -0.07
"ут"  -0.06
" transformations"  -0.06
" تعد"  -0.06
"vice"  -0.06
" Kore"  -0.06
" locale"  -0.06
" directory"  -0.06

Positive Logits

"xhr"  0.07
"RequestMapping"  0.07
" customs"  0.06
"(angle"  0.06
"aft"  0.06
" evidently"  0.06
"نوع"  0.06
"(cs"  0.06
":A"  0.06
"klär"  0.06

Activations Density: 0.033%
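
To give a sense of scale for the density figure, the sketch below converts the percentage into an expected count of active tokens. It assumes, since the page does not define the term, that activation density is the fraction of sampled tokens on which the feature has a nonzero activation. The function names are hypothetical.

```python
def density_to_fraction(density_percent: float) -> float:
    """Convert a density given in percent (e.g. 0.033) to a plain fraction."""
    return density_percent / 100.0


def expected_active_tokens(density_percent: float, n_tokens: int) -> float:
    """Expected number of tokens on which the feature fires.

    Assumes density is a per-token firing rate, which this page does not confirm.
    """
    return density_to_fraction(density_percent) * n_tokens


if __name__ == "__main__":
    density = 0.033       # percent, as listed on the page
    context_size = 1_024  # tokens, as listed in the configuration
    per_context = expected_active_tokens(density, context_size)
    per_million = expected_active_tokens(density, 1_000_000)
    print(f"per {context_size}-token context: {per_context:.3f}")
    print(f"per million tokens: {per_million:.0f}")
```

Under that reading, the feature fires on roughly one token in every three full-length contexts, or about 330 tokens per million.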

No Known Activations