Explanations

possibility/suggestion/hedging language (np_max-act · gemini-2.0-flash)

Configuration: andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_7/trainer_1
Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.7.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

Token          Logit
" trembling"   -0.07
"KM"           -0.06
"brief"        -0.06
"lland"        -0.06
" Гри"         -0.06
"маг"          -0.06
"jwt"          -0.06
" parch"       -0.06
" 의해"         -0.06
"ferences"     -0.06

Positive Logits

Token          Logit
" estimates"   0.07
" работы"      0.07
".multiply"    0.07
" Osama"       0.06
".player"      0.06
"BIG"          0.06
" Overall"     0.06
" Burn"        0.06
" candle"      0.06
".')↵"         0.06

Activations Density: 1.200%
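
For orientation, an activation density like the one above is usually read as the fraction of sampled token positions on which the feature fires. The sketch below assumes that reading (activation strictly greater than zero counts as firing). The page does not state how its figure was computed. The function name, the threshold parameter, and the toy data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature is
    active; this is an illustrative definition, not the dashboard's own code.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Toy data, made up for illustration: 1,000 token positions, 12 of them active.
toy_activations = [0.0] * 988 + [0.5] * 12
density = activation_density(toy_activations)
print(f"{density:.3%}")  # prints 1.200%
```

Under this reading, a density of 1.200% would mean the feature is active on roughly 12 of every 1,000 sampled tokens.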
No Known Activations