Explanations

interpersonal relationships
np_max-act · gemini-2.0-flash

emotional expressions and gestures in romantic contexts
oai_token-act-pair · gpt-4o-mini · Triggered by @xinyanhu8

The neuron detects speaker‐turn labels and character identifiers in dialogue (e.g. tokens like NAME_1, NAME_2, and header/ID markers).
oai_token-act-pair · o4-mini · Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

Token    Logit
trs    -0.07
,j    -0.07
 oggi    -0.07
={}    -0.07
ВС    -0.07
 برخ    -0.06
IVEN    -0.06
人の    -0.06
ốc    -0.06
orage    -0.06

Positive Logits

Token    Logit
 добавить    0.07
.Controls    0.07
 ----------↵    0.07
ड    0.06
START    0.06
 banning    0.06
 noci    0.06
 вещ    0.06
iropr    0.06
 creds    0.06

Activations Density 0.045%
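
As a rough illustration of what a density figure like the one above measures, the sketch below computes the fraction of token positions on which a feature's activation is above a threshold. The function name `activation_density`, the threshold default, and the sample values are all hypothetical. This is a sketch under those assumptions, not the dashboard's own computation.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical activations for one feature over 10,000 token positions:
# nonzero on 4 positions, zero everywhere else.
sample = [0.0] * 10000
for i in (10, 250, 4000, 9999):
    sample[i] = 1.5

density = activation_density(sample)
print(f"{density:.3%}")  # formats the fraction as a percentage
```

A density well below 1%, such as the 0.045% reported here, would mean the feature fires on only a small share of the token positions examined.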

No Known Activations