Explanations

Arranging objects

np_max-act · gemini-2.0-flash

instructions or steps on how to stack items in a stable manner.

oai_token-act-pair · gpt-4o-mini Triggered by @xinyanhu8

This neuron activates on tokens in user requests or assistant replies that give or ask for step-by-step “stacking” instructions (e.g. “stack,” “stable,” “manner,” etc.).

oai_token-act-pair · o4-mini Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted


Negative Logits

"/lo"          -0.07
" sinister"    -0.06
"lene"         -0.06
" inhibition"  -0.06
" hiring"      -0.06
" whose"       -0.06
" realizado"   -0.06
" Payload"     -0.06
"αρ"           -0.06
" avenues"     -0.06

Positive Logits

" rağmen"       0.07
"ycopg"         0.07
"_HAND"         0.07
"URIComponent"  0.07
"(ag"           0.06
"_COM"          0.06
"_FW"           0.06
"кувати"        0.06
"lli"           0.06

Activations Density 0.011%
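
To make the density figure concrete, here is a minimal sketch of how such a number could be computed. It assumes that "density" means the fraction of sampled tokens on which the feature's activation is above zero; the dashboard does not state its exact definition here, and the function name, threshold parameter, and toy data below are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumption: this mirrors what the dashboard calls "Activations Density";
    the real computation may differ (e.g. in sampling or thresholding).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 11 active tokens out of 100,000 sampled tokens.
toy = [0.0] * 99_989 + [1.5] * 11
density = activation_density(toy)
print(f"{density:.3%}")  # prints 0.011%
```

Under this reading, a density of 0.011% means the feature fires on roughly 1 token in every 9,000, which would be consistent with a narrow, topic-specific feature.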


No Known Activations
