
Explanations

pen

np_max-act · gemini-2.0-flash

The neuron selectively activates on occurrences of the substring “pen,” whether as the standalone token or as part of words like “penicillin.”

oai_token-act-pair · o4-mini Triggered by @xinyanhu8


Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted


Negative Logits

Token              Logit
"更"               -0.07
"_HIGH"            -0.07
" segregation"     -0.07
"arda"             -0.07
"arcy"             -0.06
"high"             -0.06
".Static"          -0.06
"\Validator"       -0.06
"_RA"              -0.06
".MixedReality"    -0.06

Positive Logits

Token         Logit
"Pen"         0.15
"pen"         0.14
"Pen"         0.10
" pens"       0.09
" pencil"     0.09
"pen"         0.09
" penned"     0.09
"_pen"        0.08
" penc"       0.08
".pen"        0.08
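
The page does not say how the negative and positive logit lists above are computed. A common approach for sparse autoencoder features is to take the dot product of the feature's decoder direction with each token's unembedding vector and sort the results. The sketch below assumes that approach; the function name `top_logit_effects` and the toy vectors are hypothetical and are not taken from the dashboard.

```python
def top_logit_effects(decoder_direction, unembedding, vocab, k=3):
    """Rank tokens by the dot product between a feature's decoder
    direction and each token's unembedding vector.

    Returns (top_k_positive, top_k_negative), each a list of
    (token, score) pairs ordered from most to least extreme.
    """
    effects = []
    for token, token_vector in zip(vocab, unembedding):
        score = sum(d * u for d, u in zip(decoder_direction, token_vector))
        effects.append((token, score))

    ranked = sorted(effects, key=lambda pair: pair[1], reverse=True)
    most_positive = ranked[:k]
    most_negative = ranked[-k:][::-1]  # most negative first
    return most_positive, most_negative


# Toy two-dimensional example; these vectors are invented for illustration.
vocab = ["pen", " pencil", " high", "arda"]
unembedding = [[1.0, 0.0], [0.8, 0.1], [-0.5, 0.2], [-0.6, 0.0]]
direction = [0.15, 0.0]

positive, negative = top_logit_effects(direction, unembedding, vocab, k=2)
print([token for token, _ in positive])
print([token for token, _ in negative])
```

Tokens are kept as exact strings, including any leading space, because " pens" and "pens" are distinct vocabulary entries.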

Activations Density 0.012%
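
The page does not define activation density. One plausible reading is the fraction of sampled token positions on which the feature's activation is greater than zero. The sketch below assumes that definition; `activation_density` and the sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation
    exceeds `threshold` (assumed definition of density)."""
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for value in activations if value > threshold)
    return active / len(activations)


# Invented sample: 12 active positions out of 100,000 tokens.
sample = [0.0] * 99_988 + [1.5] * 12
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.012%
```

Under this reading, a density of 0.012% means the feature fires on roughly 12 of every 100,000 tokens, which fits a feature tied to one specific substring.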


No Known Activations