Explanations

math

np_max-act · gemini-2.0-flash

The neuron flags occurrences of “sum of” in arithmetic questions—that is, it detects when the text is asking to compute a sum.

oai_token-act-pair · o4-mini Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted
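
As a rough illustration of what an SAE with a "standard" architecture computes on the residual stream at blocks.11.hook_resid_post, the sketch below encodes one activation vector into feature activations and reconstructs it. It assumes the common ReLU formulation (subtract the decoder bias, apply the encoder, ReLU, then decode); the page does not confirm that this exact variant was used. The function name sae_forward, the toy sizes, and the random weights are all hypothetical stand-ins for the real 131,072-feature dictionary.

```python
import random


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sae_forward(x, w_enc, b_enc, w_dec, b_dec):
    """Encode a residual vector into feature activations, then reconstruct it.

    Assumed formulation (not confirmed by the source):
        acts  = ReLU(W_enc (x - b_dec) + b_enc)
        recon = sum_i acts[i] * W_dec[i] + b_dec
    """
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    pre_acts = [p + b for p, b in zip(matvec(w_enc, centered), b_enc)]
    acts = [max(0.0, p) for p in pre_acts]

    recon = list(b_dec)
    for act, dec_row in zip(acts, w_dec):
        if act:  # inactive features contribute nothing
            for j, w in enumerate(dec_row):
                recon[j] += act * w
    return acts, recon


# Toy sizes for illustration only; the dashboard lists 131,072 features.
D_MODEL, N_FEATURES = 4, 8
rng = random.Random(0)
w_enc = [[rng.gauss(0, 1) for _ in range(D_MODEL)] for _ in range(N_FEATURES)]
w_dec = [[rng.gauss(0, 1) for _ in range(D_MODEL)] for _ in range(N_FEATURES)]
b_enc = [0.0] * N_FEATURES
b_dec = [0.0] * D_MODEL

x = [rng.gauss(0, 1) for _ in range(D_MODEL)]
acts, recon = sae_forward(x, w_enc, b_enc, w_dec, b_dec)
print(sum(1 for a in acts if a > 0), "of", N_FEATURES, "toy features active")
```

In this picture, the feature described on this page would be a single entry of acts: one encoder row decides when it fires, and one decoder row decides what it writes back into the residual stream.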

Negative Logits

"_rows"       -0.08
"West"        -0.06
" inherits"   -0.06
" People"     -0.06
" Ваш"        -0.06
" args"       -0.06
"-eyed"       -0.06
"aisy"        -0.06
"people"      -0.06
" WHITE"      -0.06

Positive Logits

" водой"      0.07
".getInfo"    0.06
"/mol"        0.06
" remainder"  0.06
" podařilo"   0.06
"/tcp"        0.06
" stating"    0.06
"既"          0.06
"_rho"        0.06
" شر"         0.06
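
Lists like the ones above are commonly produced by projecting a feature's decoder direction through the model's unembedding matrix and ranking tokens by the resulting score. The sketch below shows that projection on toy data. Whether this dashboard uses exactly this computation (for example, with or without a final layer norm) is an assumption, and the function name top_logit_effects, the placeholder tokens, and all numbers are hypothetical.

```python
def top_logit_effects(decoder_dir, unembed, vocab, k=3):
    """Rank tokens by the dot product of a decoder direction with each
    token's unembedding vector.

    decoder_dir: list of d_model floats (the feature's decoder row)
    unembed:     one d_model-length list per vocabulary token
    vocab:       token strings, aligned with unembed
    Returns (most negative k, most positive k) as (token, effect) pairs.
    """
    effects = [sum(d * u for d, u in zip(decoder_dir, vec)) for vec in unembed]
    ranked = sorted(zip(vocab, effects), key=lambda pair: pair[1])
    return ranked[:k], ranked[::-1][:k]


# Toy example with placeholder tokens; these are not the real model's values.
toy_vocab = ["tok_a", "tok_b", "tok_c"]
toy_unembed = [[0.5, 1.0], [-0.8, 0.3], [0.1, -2.0]]
toy_decoder_dir = [1.0, 0.0]

negative, positive = top_logit_effects(toy_decoder_dir, toy_unembed, toy_vocab, k=1)
print("most suppressed:", negative)
print("most promoted:", positive)
```

The magnitudes listed for this feature are small and nearly uniform (about 0.06 to 0.08 in absolute value), so under this reading the feature has only a weak direct effect on any particular output token.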

Activations Density 0.012%
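
To put the density figure in perspective, the sketch below treats it as the fraction of token positions on which the feature is active and works out what that implies for a 1,024-token context. Reading the density as a per-token fraction is an assumption about how the dashboard defines it; the function name activation_density and the sample list are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds the threshold."""
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy sample: one active position out of four.
sample = [0.0, 0.0, 0.5, 0.0]
print(activation_density(sample))

# Worked arithmetic for the reported figure, under the per-token assumption.
density = 0.012 / 100            # 0.012% expressed as a fraction
context_size = 1024              # context size listed in the configuration
expected_active = density * context_size
print(f"about {expected_active:.3f} active positions per context")
print(f"roughly 1 active token per {1 / density:,.0f} tokens")
```

Under that assumption the feature fires on roughly one token in every 8,300, which means most 1,024-token contexts contain no activation at all.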
