Explanations

Many

np_max-act · gemini-2.0-flash

The neuron fires on general-purpose quantifiers and vague discourse markers (e.g. “many,” “most,” “usually,” “something”).

oai_token-act-pair · o4-mini Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted
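As a rough illustration of what a "standard" architecture on a residual-stream hook usually implies, here is a minimal sketch of a ReLU sparse autoencoder forward pass. It assumes that "standard" means a single-layer ReLU encoder with a decoder-bias subtraction; the real weights may follow a different convention. The dimensions are toy values (the page lists 131,072 features), and the names `sae_encode` and `sae_decode` are hypothetical helpers, not part of any library.

```python
import numpy as np

def sae_encode(x, W_enc, b_enc, b_dec):
    """Feature activations for residual vectors x (assumed ReLU encoder)."""
    # Assumption: the decoder bias is subtracted before encoding.
    return np.maximum(0.0, (x - b_dec) @ W_enc + b_enc)

def sae_decode(f, W_dec, b_dec):
    """Reconstruct residual vectors from feature activations."""
    return f @ W_dec + b_dec

rng = np.random.default_rng(0)
d_model, n_features, n_tokens = 16, 64, 5  # toy sizes, not the real ones

W_enc = rng.normal(size=(d_model, n_features)).astype(np.float32)
W_dec = rng.normal(size=(n_features, d_model)).astype(np.float32)
b_enc = np.zeros(n_features, dtype=np.float32)
b_dec = np.zeros(d_model, dtype=np.float32)

# Stand-in for activations captured at blocks.11.hook_resid_post
x = rng.normal(size=(n_tokens, d_model)).astype(np.float32)

feats = sae_encode(x, W_enc, b_enc, b_dec)   # (n_tokens, n_features)
recon = sae_decode(feats, W_dec, b_dec)      # (n_tokens, d_model)
```

Each column of `feats` corresponds to one feature; the feature described on this page would be a single such column.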

Negative Logits

"igate"      -0.06
" понять"    -0.06
".stream"    -0.06
".ge"        -0.06
" hereby"    -0.06
" quindi"    -0.06
" října"     -0.06
" trem"      -0.06
"(menu"      -0.06
" rigorous"  -0.06

Positive Logits

" Many"        0.08
"something"    0.07
"sparse"       0.07
" Lots"        0.07
" Last"        0.07
" Limited"     0.07
"endencies"    0.07
"Alternative"  0.07
" Doctors"     0.07
"Vý"           0.07
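Logit lists like the ones above are commonly produced by projecting a feature's decoder direction through the model's unembedding matrix and sorting the result. The sketch below assumes that procedure; the page itself does not state how its values were computed. The matrix, the tiny vocabulary, and the helper name `top_logit_tokens` are all hypothetical, and the numbers are toy values chosen for readability, not model weights.

```python
import numpy as np

def top_logit_tokens(decoder_dir, W_U, vocab, k=2):
    """Return the k most negative and k most positive (token, effect) pairs."""
    effects = decoder_dir @ W_U            # one value per vocabulary token
    order = np.argsort(effects)            # ascending order
    negative = [(vocab[i], float(effects[i])) for i in order[:k]]
    positive = [(vocab[i], float(effects[i])) for i in order[::-1][:k]]
    return negative, positive

# Toy setup: 2-dimensional residual stream, 4-token vocabulary.
vocab = [" Many", "igate", "the", "something"]
W_U = np.array([[0.08, -0.06, 0.00, 0.07],
                [0.00,  0.00, 0.00, 0.00]])
decoder_dir = np.array([1.0, 0.0])

neg, pos = top_logit_tokens(decoder_dir, W_U, vocab)
```

With these toy values, `neg` starts with `"igate"` and `pos` starts with `" Many"`. The leading space in a token such as `" Many"` is part of the token and is kept as-is.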

Activations Density 0.090%
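Activation density is usually read as the fraction of tokens on which the feature is active. Under that assumption, 0.090% corresponds to roughly 9 active tokens per 10,000. A minimal sketch follows, using a hypothetical helper `activation_density` and synthetic activations rather than real data:

```python
import numpy as np

def activation_density(acts, threshold=0.0):
    """Fraction of token positions where the feature activation exceeds threshold."""
    acts = np.asarray(acts)
    return float((acts > threshold).mean())

# Synthetic example: 9 active positions out of 10,000 tokens.
acts = np.zeros(10_000)
acts[:9] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")
```

The threshold of zero is an assumption; a dashboard might use a small positive cutoff instead.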

No Known Activations