Explanations

up/build

np_max-act · gemini-2.0-flash

The neuron is primarily triggered by the term “build-up” (and its subparts like “build” + “up”), i.e. mentions of accumulation.

oai_token-act-pair · o4-mini Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

" яс"          -0.07
"щее"          -0.07
" Shan"        -0.06
" rescue"      -0.06
" relativ"     -0.06
"esto"         -0.06
" också"       -0.06
"annel"        -0.06
" Integral"    -0.06
"ắ"            -0.06

Positive Logits

"---------↵↵"           0.07
" Wright"               0.06
"package"               0.06
" بال"                  0.06
"Bai"                   0.06
" PreparedStatement"    0.06
" objectType"           0.06
" exagger"              0.06
"_HISTORY"              0.06
" buildup"              0.06

Activations Density 0.007%
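
To put the density figure in perspective, here is a minimal sketch that converts it into expected firing counts. It assumes that "Activations Density" means the fraction of tokens on which the feature has a nonzero activation, and it reuses the 1,024-token context size from the configuration above; the function and variable names are illustrative only.

```python
# Assumption: density = fraction of tokens where the feature is active.
DENSITY_PERCENT = 0.007   # "Activations Density" from the dashboard
CONTEXT_SIZE = 1024       # "Context Size" from the configuration


def expected_active_tokens(density_percent: float, n_tokens: int) -> float:
    """Expected number of tokens on which the feature fires among n_tokens."""
    return density_percent / 100.0 * n_tokens


# Expected firings within a single 1,024-token context.
per_context = expected_active_tokens(DENSITY_PERCENT, CONTEXT_SIZE)

# Average number of tokens between firings, and the same in contexts.
tokens_per_activation = 100.0 / DENSITY_PERCENT
contexts_per_activation = tokens_per_activation / CONTEXT_SIZE

print(f"expected active tokens per context: {per_context:.4f}")
print(f"about one firing every {tokens_per_activation:,.0f} tokens")
print(f"about one firing every {contexts_per_activation:.1f} contexts")
```

Under that reading, the feature fires on roughly one token in every 14,000, which would be consistent with a narrow feature tied to a single term such as "build-up".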

No Known Activations
