Explanations

short descriptions/information
np_max-act · gemini-2.0-flash

The neuron detects instruction words in the prompt that tell the model to generate or rewrite text, for example “write,” “short,” “description,” and “headline.”
oai_token-act-pair · o4-mini · Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

" pirate"    -0.07
"LCD"        -0.07
"-and"       -0.06
" refs"      -0.06
"ViewPager"  -0.06
"gcd"        -0.06
"Whenever"   -0.06
"(embed"     -0.06
"(gc"        -0.06
"arged"      -0.06

Positive Logits

"涉"         0.06
" فرمان"     0.06
" JSName"    0.06
" approve"   0.06
"ΔE"         0.06
"dg"         0.06
"\tsuper"    0.06
" Loren"     0.05
" donna"     0.05

Activations Density 0.005%
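
For context on the density figure above, here is a minimal sketch of how such a number could be computed, assuming density means the fraction of token positions on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and for illustration only. They are not taken from the dashboard or its tooling.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical toy data: the feature fires on 1 of 20,000 token positions.
toy_activations = [0.0] * 19_999 + [3.2]
density = activation_density(toy_activations)
print(f"{density:.3%}")  # prints 0.005%
```

Under this reading, a density of 0.005% corresponds to the feature firing on roughly one token in every 20,000.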
