Explanations

np_max-act · gemini-2.0-flash

The neuron fires on sentence- or clause-initial context-setting words or phrases (e.g. adverbs like “Pathologically,” “Mechanically,” “Clinically,” or prepositional leads like “On deck,” “In vitro,” etc.) that launch a new statement.
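The surface pattern named in this explanation can be approximated with a small text heuristic. The sketch below is only an illustration of that pattern, not the feature itself and not how the explanation was produced: the regular expression, the name `find_context_leads`, and the short preposition list are all assumptions made for this example.

```python
import re

# Hypothetical heuristic (an assumption for illustration): a sentence-initial
# capitalised "-ly" adverb, or a short prepositional lead, followed by a comma.
LEAD = re.compile(
    r"(?:^|(?<=[.!?]\s))"            # start of text, or just after ". " / "! " / "? "
    r"((?:[A-Z][a-z]+ly)"            # adverbs such as "Clinically"
    r"|(?:(?:On|In|At|By)\s[a-z]+))"  # leads such as "In vitro", "On deck"
    r","                             # the comma that closes the lead
)

def find_context_leads(text):
    """Return the sentence-initial context-setting leads found in text."""
    return LEAD.findall(text)

sample = "Clinically, the patient improved. In vitro, the cells divided."
leads = find_context_leads(sample)
print(leads)
```

A heuristic like this only captures the lexical shape of the examples quoted above; the actual feature is a learned direction in the residual stream and may respond to cases the pattern misses.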

oai_token-act-pair · o4-mini Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

" landscapes"  -0.08
" distributed"  -0.07
" magnetic"  -0.07
"污"  -0.07
"]='"  -0.06
"      ↵      ↵"  -0.06
"冷"  -0.06
".'/"  -0.06
" nového"  -0.06
" stitched"  -0.06

Positive Logits

"різ"  0.07
"-controls"  0.07
"antis"  0.07
"ibus"  0.06
"athroom"  0.06
".escape"  0.06
" مق"  0.06
"Eis"  0.06
"iams"  0.06
"Stopped"  0.06

Activations Density 0.042%
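For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes that density means the fraction of token positions at which the feature's activation is above zero; the page does not state the exact definition, and the function name `activation_density` and the toy data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds threshold.

    Assumption: "density" counts positions with activation > threshold.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)

# Toy data (made up for illustration): 10,000 token positions,
# 4 of which activate the feature.
toy = [0.0] * 10_000
for i in (17, 256, 4096, 9001):
    toy[i] = 1.5

density = activation_density(toy)
print(f"{density:.3%}")  # prints 0.040%
```

Under this reading, a density of 0.042% would correspond to roughly 4 active positions per 10,000 tokens, which is consistent with a sparse feature.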

No Known Activations