
Explanations

answers, explanations

np_max-act · gemini-2.0-flash

The neuron detects when the user asks for an explanation “to a ⟨number⟩‐year‐old,” i.e. age-specified explanation requests.

oai_token-act-pair · o4-mini Triggered by @xinyanhu8

Requests for, and responses giving, simplified, kid-friendly explanations aimed at very young children (e.g., “explain like I’m five”).

oai_token-act-pair · gpt-5 Triggered by @vetterc0
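To make the two explanations above concrete, here is a minimal sketch of the kind of text they describe. It is only a surface-level approximation: the feature itself is a learned direction in the residual stream, not a regular expression, and the names `AGE_REQUEST`, `ELI5`, and `looks_like_age_specified_request` are hypothetical helpers introduced for illustration.

```python
import re

# Hypothetical pattern for "to a <number>-year-old" style phrasing.
# Accepts a space, an ASCII hyphen, or U+2010 between the words.
AGE_REQUEST = re.compile(
    r"\b(?:to|for)\s+an?\s+(\d{1,2})[\s\-\u2010]year[\s\-\u2010]old\b",
    re.IGNORECASE,
)

# Hypothetical pattern for the "explain like I'm five" idiom.
ELI5 = re.compile(
    r"\bexplain\s+(?:it\s+)?like\s+i[’']?m\s+(?:five|5)\b",
    re.IGNORECASE,
)


def looks_like_age_specified_request(text: str) -> bool:
    """Return True if the text resembles an age-specified explanation request."""
    return bool(AGE_REQUEST.search(text) or ELI5.search(text))


examples = [
    "Explain gravity to a 7-year-old.",
    "Can you explain like I'm five how vaccines work?",
    "What is the capital of France?",
]
results = [looks_like_age_specified_request(t) for t in examples]
```

A heuristic like this could be used to check whether prompts that match the described wording actually activate the feature, which is one way to sanity-check an auto-generated explanation.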


Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted


Negative Logits

 camps  -0.07
 slave  -0.06
icipation  -0.06
 dungeon  -0.06
umno  -0.06
.query  -0.06
 unnecessary  -0.06
“No  -0.06
'%  -0.06
 comparing  -0.06

Positive Logits

вав  0.07
 örg  0.06
 piger  0.06
联合  0.06
_CONNECTED  0.06
ícia  0.06
 IonicModule  0.06
nez  0.06
 khảo  0.06
zd  0.06

Activations Density 0.027%
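As a rough guide to reading this number, the sketch below assumes that density means the fraction of sampled token positions on which the feature activation is greater than zero. The dashboard does not state its definition, so treat this as an assumption; `activation_density` is a hypothetical helper and the data is a toy example, not values from this feature.

```python
def activation_density(acts):
    """Fraction of token positions with a strictly positive activation.

    Assumes "density" is defined as (# positions where act > 0) / (# positions).
    """
    acts = list(acts)
    return sum(1 for a in acts if a > 0) / len(acts)


# Toy data: 10,000 token positions, the feature fires on 3 of them.
acts = [0.0] * 10_000
acts[10], acts[500], acts[9_000] = 1.2, 0.4, 3.1

density = activation_density(acts)
print(f"{density:.3%}")  # 0.030%

# Under the same assumption, a density of 0.027% is about one firing
# per 1 / 0.00027 tokens, i.e. roughly one in every 3,700 tokens.
tokens_per_firing = 1 / 0.00027
```

Under that reading, a density of 0.027% marks a sparse feature, which fits the narrow behavior given in the explanations.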


No Known Activations