Explanations

equals signs

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_7/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.7.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

Logit  Token
-0.07  Ji
-0.07   unveil
-0.06  (blank)
-0.06  mơ
-0.06  dfd
-0.06   Fans
-0.06  ”,
-0.06  隐藏
-0.06  Load
-0.06   σκο

Positive Logits

Logit  Token
0.13  ================================================================
0.07  .typ
0.07   محصولات
0.06  [col
0.06   gathers
0.06   =================================================================
0.06  -ind
0.06  очка
0.06  lex
0.06  .getElements

Activations Density: 0.001%

No Known Activations