INDEX

Explanations

Condemnation of attacks

np_max-act · gemini-2.0-flash

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_19/trainer_1

Dataset (Dashboard): Various

Features: 131,072

Data Type: float32

Hook Name: blocks.19.hook_resid_post

Architecture: standard

Context Size: 1,024

Dataset: monology/pile-uncopyrighted

Negative Logits

	input  -0.07
 آپ  -0.07
される  -0.07
 amplitude  -0.07
炮  -0.06
-lo  -0.06
一点  -0.06
ًا  -0.06
utos  -0.06
انه  -0.06

Positive Logits

 rins  0.07
 objectForKey  0.06
.required  0.06
 reun  0.06
.Flag  0.06
�  0.06
Fe  0.06
Hem  0.06
▍  0.06
++++  0.06
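
Logit lists like the ones above are commonly produced by projecting a feature's decoder direction through the model's unembedding matrix and sorting the resulting per-token scores. The sketch below assumes that convention and substitutes small random matrices for the real weights; `top_logit_effects`, `decoder_dir`, and `W_U` are hypothetical names chosen for illustration, not identifiers from the SAE release.

```python
import numpy as np

def top_logit_effects(decoder_dir, W_U, k=10):
    """Return the k most negative and k most positive (token_id, score) pairs.

    decoder_dir: shape (d_model,), one feature's decoder direction (assumed).
    W_U:         shape (d_model, vocab), unembedding matrix (assumed).
    """
    effects = decoder_dir @ W_U            # one score per vocabulary entry
    order = np.argsort(effects)            # ascending by score
    negative = [(int(i), float(effects[i])) for i in order[:k]]
    positive = [(int(i), float(effects[i])) for i in order[::-1][:k]]
    return negative, positive

# Toy stand-ins for the real weights; the shapes are illustrative only.
rng = np.random.default_rng(0)
d_model, vocab = 16, 100
decoder_dir = rng.normal(size=d_model)
W_U = rng.normal(size=(d_model, vocab))

neg, pos = top_logit_effects(decoder_dir, W_U, k=10)
```

With real weights, the token ids would be decoded through the model's tokenizer to obtain strings such as those listed above.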

Activations Density 0.059%
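
The density figure is read here as the fraction of sampled tokens on which the feature's activation is nonzero. That reading is an assumption, as are the names `activation_density` and `acts` and the toy data in this minimal sketch.

```python
import numpy as np

def activation_density(acts, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`."""
    acts = np.asarray(acts, dtype=np.float64)
    if acts.size == 0:
        raise ValueError("acts must contain at least one value")
    return float(np.mean(acts > threshold))

# Toy data: 10,000 token positions, 6 of which activate the feature.
acts = np.zeros(10_000)
acts[[3, 500, 1024, 2048, 4096, 8191]] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")
```

A density near 0.06% corresponds to roughly 6 active tokens per 10,000 under this definition.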

Condemnation of attacks

No Known Activations
