Explanations

URL encoding and characters

np_max-act-logits · gemini-2.0-flash

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_27/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.27.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

 bourgeois: -0.07
Streamer: -0.07
 bananas: -0.07
 adjoining: -0.07
Uniform: -0.07
 Sonra: -0.06
数据: -0.06
db: -0.06
 converged: -0.06
	format: -0.06

Positive Logits

_PED: 0.06
PT: 0.06
pep: 0.06
","");↵: 0.06
 withd: 0.06
wegian: 0.06
 تخصص: 0.06
_CAR: 0.06
 Completion: 0.06

Activations Density 0.019%

No Known Activations