Explanations

a and an

np_max-act · gemini-2.0-flash

This neuron responds to definitional or descriptive “is/are a …” phrases, marking when the text expresses that something “is a [noun]” or “are a [noun]” (e.g. “are a valuable tool”).

oai_token-act-pair · o4-mini Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted
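
To put the configuration values in perspective, the sketch below derives the expansion factor and an approximate parameter count from them. It rests on two assumptions that the page does not state: that the residual stream width of Llama 3.1 8B is 4,096, and that "standard" means a single-layer autoencoder with one encoder matrix, one decoder matrix, and two bias vectors. The `SaeConfig` class is hypothetical and only serves this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SaeConfig:
    """Hypothetical container for the values listed under Configuration."""
    hook_name: str
    n_features: int
    dtype_bytes: int      # float32 -> 4 bytes per parameter
    context_size: int
    d_model: int          # ASSUMPTION: not shown on the page

    def expansion_factor(self) -> float:
        # Ratio of dictionary size to residual stream width.
        return self.n_features / self.d_model

    def parameter_count(self) -> int:
        # ASSUMPTION: W_enc (d_model x n_features), W_dec (n_features x d_model),
        # b_enc (n_features), b_dec (d_model).
        return 2 * self.d_model * self.n_features + self.n_features + self.d_model

    def size_gib(self) -> float:
        return self.parameter_count() * self.dtype_bytes / 2**30


cfg = SaeConfig(
    hook_name="blocks.11.hook_resid_post",
    n_features=131_072,
    dtype_bytes=4,
    context_size=1_024,
    d_model=4_096,
)

print(f"expansion factor: {cfg.expansion_factor():.0f}x")
print(f"parameters:       {cfg.parameter_count():,}")
print(f"float32 size:     {cfg.size_gib():.2f} GiB")
```

Under these assumptions the dictionary is 32 times wider than the residual stream, and the weights occupy roughly 4 GiB in float32.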

Negative Logits

Token        Logit
"Ids"        -0.09
"XML"        -0.09
"hits"       -0.08
"TS"         -0.08
" DAYS"      -0.08
" Abrams"    -0.07
" Days"      -0.07
" expos"     -0.07
"аются"      -0.07

Positive Logits

Token        Logit
"explained"  0.06
" []↵↵↵"     0.06
" necklace"  0.06
"-haired"    0.06
"/world"     0.06
" Celebr"    0.06
" childhood" 0.06
"climate"    0.06
" ความ"      0.06
"vô"         0.06
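
Lists of negative and positive logits like these are commonly produced by projecting a feature's decoder direction through the model's unembedding matrix and ranking the vocabulary by the result. The page does not say how its values were computed, so the sketch below is an assumption about the method, run on invented toy numbers. The function names, the three-token vocabulary, and the weights are all hypothetical.

```python
def feature_logit_effects(decoder_direction, unembed_rows, vocab):
    """Rank tokens by the dot product of the feature direction with each
    token's unembedding row (one row per vocabulary entry)."""
    scores = [
        sum(d * w for d, w in zip(decoder_direction, row))
        for row in unembed_rows
    ]
    # Ascending order: most negative effect first, most positive last.
    return sorted(zip(vocab, scores), key=lambda pair: pair[1])


def top_and_bottom(ranked, k):
    """Return (k most negative, k most positive) token/score pairs."""
    negative = ranked[:k]
    positive = ranked[-k:][::-1]
    return negative, positive


# Toy data, NOT taken from the dashboard.
toy_vocab = ["a", "b", "c"]
toy_unembed = [
    [0.1, 0.2],    # row for "a"
    [-0.2, 0.0],   # row for "b"
    [0.3, -0.2],   # row for "c"
]
toy_direction = [1.0, -0.5]

ranked = feature_logit_effects(toy_direction, toy_unembed, toy_vocab)
negative, positive = top_and_bottom(ranked, k=1)
print("most suppressed token:", negative[0][0])
print("most promoted token:  ", positive[0][0])
```

On this reading, the small magnitudes in the lists (at most 0.09 in absolute value) would suggest the feature has only a weak direct effect on any single output token.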

Activations Density 0.074%
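
Activation density is usually read as the fraction of sampled tokens on which the feature fires at all. Assuming that definition applies here (the page does not spell it out), the following sketch shows the calculation on a made-up activation list sized so that the result matches the displayed 0.074%. The function name and the token counts are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`."""
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Toy data, NOT taken from the dashboard: 74 active tokens out of 100,000.
toy_activations = [0.0] * 100_000
for i in range(74):
    toy_activations[i * 1_000] = 1.5

density = activation_density(toy_activations)
print(f"density: {density:.3%}")
```

A density this low means the feature is silent on the vast majority of tokens, which fits a feature tied to one specific phrase pattern.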

No Known Activations
