Explanations

beginning of stories

np_max-act · gemini-2.0-flash

The neuron activates on chunks that describe a character’s immediate setting and current state or actions (e.g., “You’re in bed,” “NAME_1 is at the pub,” “You’re bored and feeling horny,” “You’re about to message me…”).

oai_token-act-pair · o4-mini Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

Token          Logit
" reshape"     -0.07
" Burk"        -0.06
"舌"           -0.06
".Pin"         -0.06
" Israeli"     -0.06
" Fiber"       -0.06
" Skinner"     -0.06
" broth"       -0.06
"resas"        -0.06
" Kelly"       -0.06

Positive Logits

Token          Logit
" Concern"     0.09
"_packages"    0.06
" messed"      0.06
".student"     0.06
"{@"           0.06
".ali"         0.06
" milyon"      0.06
"환경"         0.06
"rient"        0.06
"<%="          0.06
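
The logit lists above rank vocabulary tokens by how much this feature pushes their output logits down or up. A minimal sketch of one common way such rankings are produced, assuming the feature's decoder direction is projected directly through the model's unembedding matrix (any final normalization is ignored here). The function name `top_logit_effects`, the toy vectors, and the token labels `tok_a` to `tok_d` are all hypothetical and are not values from this dashboard.

```python
def top_logit_effects(decoder_direction, unembedding, vocab, k=2):
    """Rank tokens by the dot product of a feature direction with each token's unembedding row.

    decoder_direction: list of floats, length d_model (assumed feature direction).
    unembedding: one list of floats per vocabulary token, each of length d_model.
    vocab: token strings, aligned with the rows of `unembedding`.
    Returns (positive, negative): the k largest and k smallest (effect, token) pairs.
    """
    effects = [
        (sum(d * w for d, w in zip(decoder_direction, row)), token)
        for token, row in zip(vocab, unembedding)
    ]
    ranked = sorted(effects, key=lambda pair: pair[0])  # ascending by effect
    negative = ranked[:k]          # most negative first
    positive = ranked[::-1][:k]    # most positive first
    return positive, negative


# Hypothetical toy setup: d_model = 3, vocabulary of 4 tokens.
toy_direction = [1.0, 0.0, -1.0]
toy_vocab = ["tok_a", "tok_b", "tok_c", "tok_d"]
toy_unembedding = [
    [0.5, 0.1, -0.4],
    [-0.3, 0.2, 0.4],
    [0.2, 0.0, -0.1],
    [-0.1, 0.0, 0.2],
]

positive, negative = top_logit_effects(toy_direction, toy_unembedding, toy_vocab)
for effect, token in positive:
    print(f"+ {token!r}: {effect:.2f}")
for effect, token in negative:
    print(f"- {token!r}: {effect:.2f}")
```

With effects as small and as semantically scattered as the ones listed for this feature, the ranking probably says little about what the feature represents; the activation-based explanation is likely the more informative signal.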

Activations Density 0.087%
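
As a rough illustration of what a density figure like this measures, the sketch below computes the fraction of token positions on which a feature is active. It assumes "active" means an activation strictly above zero; the function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and do not reproduce how this dashboard computed its number.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`."""
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for value in activations if value > threshold)
    return active / len(activations)


# Hypothetical toy data: 10,000 token positions, 9 of which are active.
toy_activations = [0.0] * 10_000
for i in range(9):
    toy_activations[i * 1000] = 1.5

density = activation_density(toy_activations)
print(f"Activation density: {density:.3%}")
```

A density below 0.1% means the feature fires on fewer than one token in a thousand, which is consistent with a feature tied to a narrow context such as the opening lines of a story or role-play prompt.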
