Explanations

code

np_max-act · gemini-2.0-flash

This neuron detects placeholder instruction fragments, specifically the “[ insert … here ]”-style tokens that mark where a substitution should go.

oai_token-act-pair · o4-mini

Triggered by @xinyanhu8
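To make the explanation above concrete, here is a minimal sketch of the kind of text it describes. The regular expression is only a rough, hypothetical approximation of "[ insert … here ]"-style placeholders; it is not the feature itself, and the names `PLACEHOLDER_RE` and `find_placeholders` are illustrative assumptions.

```python
import re

# Hypothetical approximation of the described pattern: a bracketed span
# that begins with the word "insert" and ends with the word "here".
# This is NOT how the feature is computed; it only illustrates the text shape.
PLACEHOLDER_RE = re.compile(r"\[\s*insert\b[^\[\]]*?\bhere\s*\]", re.IGNORECASE)


def find_placeholders(text: str) -> list[str]:
    """Return every "[ insert ... here ]"-style span found in text."""
    return PLACEHOLDER_RE.findall(text)


if __name__ == "__main__":
    sample = "Dear [insert name here], your order [ Insert order number here ] shipped."
    for span in find_placeholders(sample):
        print(span)
```

A surface pattern like this is useful mainly for collecting candidate prompts to test against the feature; whether the feature actually fires on a given span has to be checked against its activations.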

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted
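The configuration names the layer twice: once in the SAE identifier (`resid_post_layer_11`) and once in the hook name (`blocks.11.hook_resid_post`). The sketch below checks that the two agree. The helper names `hook_layer` and `sae_id_layer` are hypothetical, and the patterns assume only the naming shapes visible in this configuration.

```python
import re


def hook_layer(hook_name: str) -> int:
    """Extract the layer index from a name like 'blocks.11.hook_resid_post'."""
    match = re.fullmatch(r"blocks\.(\d+)\.hook_resid_post", hook_name)
    if match is None:
        raise ValueError(f"unexpected hook name: {hook_name!r}")
    return int(match.group(1))


def sae_id_layer(sae_id: str) -> int:
    """Extract the layer index from an ID containing 'resid_post_layer_<n>'."""
    match = re.search(r"resid_post_layer_(\d+)", sae_id)
    if match is None:
        raise ValueError(f"no layer found in SAE ID: {sae_id!r}")
    return int(match.group(1))


if __name__ == "__main__":
    sae_id = "andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1"
    hook = "blocks.11.hook_resid_post"
    # Both should point at the same residual-stream layer.
    print(sae_id_layer(sae_id) == hook_layer(hook))
```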

Negative Logits

Token        Logit
"appeared"   -0.08
"HOH"        -0.07
"antidad"    -0.07
"exus"       -0.06
"ethe"       -0.06
"LES"        -0.06
" Payne"     -0.06
"ばかり"     -0.06
"too"        -0.06
"ázev"       -0.06

Positive Logits

Token          Logit
" Viewer"      0.07
" trứng"       0.06
" quad"        0.06
"\tpacket"     0.06
",col"         0.06
"-->↵"         0.06
"тесь"         0.06
" grammar"     0.06
"exclude"      0.06
" converged"   0.06

Activations Density 0.012%
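For a sense of scale, the arithmetic below converts the density figure into more intuitive quantities. It assumes that "density" means the fraction of tokens on which the feature has a nonzero activation, and that each context is the full 1,024 tokens listed in the configuration; both are assumptions, not statements from the dashboard.

```python
# Values taken from the dashboard; the interpretation is an assumption.
density_percent = 0.012   # "Activations Density 0.012%"
context_size = 1024       # "Context Size 1,024"

# Fraction of tokens on which the feature is assumed to be active.
density = density_percent / 100

# Expected number of activating tokens in one full-length context.
expected_per_context = density * context_size

# Average number of tokens between activations.
tokens_per_activation = 1 / density

print(f"expected activations per context: {expected_per_context:.3f}")
print(f"about one activation every {tokens_per_activation:.0f} tokens")
```

Under these assumptions the feature fires on roughly one token in every eight contexts, which is consistent with a detector for a narrow, specific pattern.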

No Known Activations