Explanations

negation

np_max-act-logits · gemini-2.0-flash

Configuration: andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_23/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.23.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted
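To illustrate what an "Architecture: standard" sparse autoencoder attached to blocks.23.hook_resid_post might compute, here is a minimal sketch in plain Python. It assumes "standard" means a ReLU autoencoder that subtracts the decoder bias before encoding; the page does not confirm this parameterization. The function names (sae_encode, sae_decode) and the toy weights are hypothetical, and the sizes are tiny for readability, whereas the listed SAE has 131,072 features.

```python
# Toy sketch of a ReLU sparse autoencoder (assumed meaning of "standard").
# All names and weights here are hypothetical; real sizes are far larger.

def sae_encode(x, w_enc, b_enc, b_dec):
    """Map a residual-stream vector x to non-negative feature activations."""
    # Assumption: the decoder bias is subtracted before encoding.
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    pre_acts = [
        sum(w * c for w, c in zip(row, centered)) + b
        for row, b in zip(w_enc, b_enc)
    ]
    # ReLU keeps only positive pre-activations, which makes features sparse.
    return [max(0.0, a) for a in pre_acts]

def sae_decode(features, w_dec, b_dec):
    """Reconstruct the residual vector as a weighted sum of decoder rows."""
    d_model = len(b_dec)
    return [
        b_dec[j] + sum(features[i] * w_dec[i][j] for i in range(len(features)))
        for j in range(d_model)
    ]

# Toy example: 2-dimensional "residual stream", 3 features.
x = [1.0, -2.0]
w_enc = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # one row per feature
b_enc = [0.0, 0.0, 0.5]
b_dec = [0.0, 0.0]
w_dec = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # one row per feature

features = sae_encode(x, w_enc, b_enc, b_dec)
reconstruction = sae_decode(features, w_dec, b_dec)
```

In this toy case only the first feature is active, so the reconstruction recovers just the component of x along that feature's decoder direction.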

Negative Logits

 Compression: -0.07
Wag: -0.07
 جن: -0.07
"S: -0.06
 afar: -0.06
kg: -0.06
guide: -0.06
Vac: -0.06
Secure: -0.06
‬: -0.06

Positive Logits

 zlat: 0.07
_DISABLED: 0.07
 akci: 0.06
 consectetur: 0.06
****************: 0.06
InterruptedException: 0.06
북도: 0.06
 destinationViewController: 0.06
 Python: 0.06
.bukkit: 0.06

Activations Density 0.026%
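As a rough reading of the density figure, the arithmetic below assumes that "Activations Density" is the fraction of tokens on which the feature has a nonzero activation; the page does not define the term. The function names are hypothetical. The context size of 1,024 is taken from the configuration.

```python
# Rough arithmetic on the reported density (assumed to be a per-token firing rate).

def expected_activations_per_context(density_percent, context_size):
    """Expected number of tokens in one context on which the feature fires."""
    return (density_percent / 100.0) * context_size

def tokens_per_activation(density_percent):
    """Average number of tokens between activations."""
    return 1.0 / (density_percent / 100.0)

per_context = expected_activations_per_context(0.026, 1024)
spacing = tokens_per_activation(0.026)
```

Under that assumption the feature fires on roughly 0.27 tokens per 1,024-token context, or about once every 3,846 tokens.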

negation

No Known Activations