Explanations

common words

np_max-act · gemini-2.0-flash

This neuron strongly activates on terms related to child-development stages and toddler behavior (e.g. child psychology, toddler, “The No Stage,” etc.).

oai_token-act-pair · o4-mini Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted
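
To make the configuration more concrete, here is a minimal sketch of how a sparse autoencoder could turn one residual-stream vector (read at the hook named above) into feature activations. It assumes that "standard" denotes a plain ReLU autoencoder with a decoder-bias subtraction before encoding; the page does not confirm this. The function name, parameter names, and toy weights are all hypothetical, and the real dictionary has 131,072 features rather than three.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def sae_encode(x: Vector, w_enc: Matrix, b_enc: Vector, b_dec: Vector) -> Vector:
    """Map one residual-stream vector to non-negative feature activations.

    Assumed form (not confirmed by the page): f = ReLU(W_enc (x - b_dec) + b_enc).
    w_enc holds one row per feature, each row of length d_model.
    """
    # Center the input by the decoder bias (an assumption of this sketch).
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    # One pre-activation per feature: dot product with the encoder row plus bias.
    pre_acts = [
        sum(w * c for w, c in zip(row, centered)) + b
        for row, b in zip(w_enc, b_enc)
    ]
    # ReLU keeps only positive pre-activations, which is what makes the code sparse.
    return [max(0.0, p) for p in pre_acts]


# Toy example: d_model = 2 and 3 features, purely illustrative values.
toy_w_enc = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
toy_b_enc = [0.0, 0.0, 0.0]
toy_b_dec = [0.0, 0.0]
toy_acts = sae_encode([2.0, -1.0], toy_w_enc, toy_b_enc, toy_b_dec)
print(toy_acts)
```

In this toy case only the first feature fires. A dashboard like this one reports the behavior of a single such feature across many tokens.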

Negative Logits

Token             Logit
" Literary"       -0.07
" Narrow"         -0.07
" strips"         -0.06
" princess"       -0.06
" Lost"           -0.06
"estro"           -0.06
"getState"        -0.06
"Alf"             -0.06
"PIX"             -0.06
" Laur"           -0.06

Positive Logits

Token             Logit
"ijing"           0.07
" fayd"           0.07
"(cn"             0.07
" OUTER"          0.07
"can"             0.06
"_BLOCKS"         0.06
"ких"             0.06
" vyjád"          0.06
"_seg"            0.06
".asInstanceOf"   0.06
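
Token/logit lists like the ones above are commonly produced by projecting a feature's decoder direction through the model's unembedding matrix and ranking the vocabulary by the resulting score. The sketch below illustrates that idea under simplifying assumptions: it ignores the final normalization layer, uses a hypothetical three-token vocabulary, and all function and variable names are invented for illustration. It is not taken from the dashboard's own code.

```python
from typing import Dict, List, Tuple


def feature_logit_effects(
    decoder_dir: List[float], unembed: Dict[str, List[float]]
) -> Dict[str, float]:
    """Dot the feature's decoder direction with each token's unembedding vector."""
    return {
        token: sum(d * u for d, u in zip(decoder_dir, vec))
        for token, vec in unembed.items()
    }


def top_and_bottom(
    effects: Dict[str, float], k: int
) -> Tuple[List[Tuple[str, float]], List[Tuple[str, float]]]:
    """Return the k most-promoted and the k most-suppressed tokens."""
    ranked = sorted(effects.items(), key=lambda kv: kv[1], reverse=True)
    positive = ranked[:k]
    negative = ranked[-k:][::-1]  # most negative first
    return positive, negative


# Hypothetical toy data: d_model = 2, vocabulary of three tokens.
toy_decoder_dir = [1.0, -1.0]
toy_unembed = {"a": [0.5, 0.0], "b": [0.0, 0.5], "c": [0.25, 0.25]}

toy_effects = feature_logit_effects(toy_decoder_dir, toy_unembed)
toy_pos, toy_neg = top_and_bottom(toy_effects, k=1)
print(toy_pos, toy_neg)
```

Every value listed above lies within ±0.07, and the tokens on each side share no obvious theme, so under this reading the feature's direct effect on any single output token appears small.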

Activations Density 0.112%
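
An activation density is usually read as the fraction of sampled tokens on which the feature has a non-zero activation. The sketch below shows that calculation on made-up data. The function name and the zero threshold are assumptions, and the page does not state how its 0.112% figure was sampled.

```python
from typing import Sequence


def activation_density(acts: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of token positions whose activation exceeds the threshold."""
    if not acts:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in acts if a > threshold)
    return active / len(acts)


# Hypothetical sample: 10 token positions, the feature fires on 2 of them.
toy_acts = [0.0] * 8 + [1.5, 0.3]
toy_density = activation_density(toy_acts)
print(f"{toy_density:.3%}")  # prints "20.000%"
```

By the same definition, a density of 0.112% corresponds to roughly one active token per context of 1,024 tokens on average.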
