Explanations

modification
np_max-act · gemini-2.0-flash

The neuron specifically detects occurrences of the “modif-” stem (e.g. modify, modification, modifying).
oai_token-act-pair · o4-mini, triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

Token                  Logit
" ecosystem"           -0.08
"urray"                -0.08
" Eagle"               -0.07
" culp"                -0.07
"sea"                  -0.07
" Campbell"            -0.07
" east"                -0.06
" surpass"             -0.06
" circum"              -0.06
" EnumerableStream"    -0.06

Positive Logits

Token                  Logit
" Modified"            0.10
"Mod"                  0.09
"modifier"             0.09
" modification"        0.09
" modifier"            0.09
" modified"            0.09
"modification"         0.09
"_MOD"                 0.09
"mod"                  0.08
" modifications"       0.08
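
As a rough sanity check of the “modif-” explanation against the positive logits listed on this page, the sketch below counts how many of the ten tokens actually contain that stem. The token list is copied from this page (leading spaces are part of the token); `matches_stem` is a hypothetical helper written for this illustration, not part of any dashboard API.

```python
# Top positive-logit tokens as listed on this page.
# Leading spaces are part of the token and are kept deliberately.
positive_logits = [
    (" Modified", 0.10),
    ("Mod", 0.09),
    ("modifier", 0.09),
    (" modification", 0.09),
    (" modifier", 0.09),
    (" modified", 0.09),
    ("modification", 0.09),
    ("_MOD", 0.09),
    ("mod", 0.08),
    (" modifications", 0.08),
]


def matches_stem(token: str, stem: str = "modif") -> bool:
    """Hypothetical helper: case-insensitive substring check for the stem."""
    return stem in token.strip().lower()


# Tokens that contain the full "modif" stem.
stem_hits = [tok for tok, _ in positive_logits if matches_stem(tok)]
# Tokens that only share the shorter "mod" prefix.
bare_mod = [tok for tok, _ in positive_logits if not matches_stem(tok)]

print(f"{len(stem_hits)}/{len(positive_logits)} tokens contain 'modif':", stem_hits)
print("tokens with only the shorter 'mod' prefix:", bare_mod)
```

On this list, the tokens that do not contain “modif” are the bare prefix forms ("Mod", "_MOD", "mod"), which suggests the promoted outputs are somewhat broader than the stated stem. This only looks at output logits, not at the inputs the latent activates on.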

Activations Density 0.027%
