Explanations

say the ending -osing, -oses, or -os

np_max-act-logits · gemini-2.0-flash

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_27/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.27.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

" Granny"  -0.08
" Label"  -0.07
"etak"  -0.07
" curls"  -0.06
" sleep"  -0.06
" sentenced"  -0.06
" performance"  -0.06
" sandwiches"  -0.06
"pů"  -0.06
" vědom"  -0.06

Positive Logits

"یکی"  0.07
"oined"  0.06
" adolescents"  0.06
"_PRO"  0.06
" populated"  0.06
"PerPixel"  0.06
"мага"  0.06
"ジ"  0.06
"」"  0.06
" ان"  0.06
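
Lists like the negative and positive logits above are commonly produced by projecting a feature's decoder direction through the model's unembedding matrix and ranking the vocabulary by the resulting values. The page does not state how its numbers were computed, so the following is only a minimal sketch of that ranking step under that assumption. The names `top_logit_effects`, `w_dec_row`, and `w_unembed` are hypothetical, and the toy data is random rather than taken from the actual SAE or Llama weights.

```python
import numpy as np

def top_logit_effects(w_dec_row, w_unembed, vocab, k=10):
    """Rank tokens by the logit change that one feature direction induces.

    w_dec_row: (d_model,) decoder direction of a single feature (assumed).
    w_unembed: (d_model, vocab_size) unembedding matrix (assumed).
    vocab:     list of token strings, one per vocabulary index.
    """
    effects = w_dec_row @ w_unembed          # shape: (vocab_size,)
    order = np.argsort(effects)              # indices in ascending order
    negative = [(vocab[i], float(effects[i])) for i in order[:k]]
    positive = [(vocab[i], float(effects[i])) for i in order[::-1][:k]]
    return negative, positive

# Toy example with random weights (hypothetical sizes, not the real model).
rng = np.random.default_rng(0)
d_model, vocab_size = 16, 50
vocab = [f"tok{i}" for i in range(vocab_size)]
w_dec_row = rng.normal(size=d_model)
w_unembed = rng.normal(size=(d_model, vocab_size))

negative, positive = top_logit_effects(w_dec_row, w_unembed, vocab, k=3)
for token, value in negative + positive:
    print(f"{token!r}  {value:+.2f}")
```

Printing tokens with `repr` keeps leading spaces visible, which matters because tokens such as " Granny" and "etak" differ in whether they begin a new word.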

Activations Density 0.003%

No Known Activations
