Explanations

attenu

np_max-act · gemini-2.0-flash

The neuron detects verbs (and related forms) that describe therapeutic or inhibitory actions—words like “suppresses,” “attenuates,” “inhibits,” “prevents,” etc., signaling reduction of a pathological process.

oai_token-act-pair · o4-mini Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted
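
For readers who want to carry these settings into a script, the configuration above can be written down as a small typed record. This is only a sketch: the class name `SaeConfig`, its field names, and the `layer_index` helper are hypothetical and are not part of any library; only the values come from the page.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SaeConfig:
    """Hypothetical container for the settings listed on the dashboard."""
    sae_id: str
    hook_name: str
    n_features: int
    dtype: str
    architecture: str
    context_size: int
    dataset: str

    def layer_index(self) -> int:
        """Extract the layer number from a hook name such as 'blocks.11.hook_resid_post'."""
        match = re.fullmatch(r"blocks\.(\d+)\.hook_\w+", self.hook_name)
        if match is None:
            raise ValueError(f"unrecognized hook name: {self.hook_name!r}")
        return int(match.group(1))


# Values copied from the Configuration listing; nothing here is inferred.
config = SaeConfig(
    sae_id="andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1",
    hook_name="blocks.11.hook_resid_post",
    n_features=131_072,
    dtype="float32",
    architecture="standard",
    context_size=1_024,
    dataset="monology/pile-uncopyrighted",
)
```

Calling `config.layer_index()` gives the layer number encoded in the hook name, which matches the `resid_post_layer_11` segment of the SAE identifier.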

Negative Logits

Token            Logit
" teil"          -0.07
" цен"           -0.06
".Assign"        -0.06
" imprisonment"  -0.06
" поруш"         -0.06
"hes"            -0.06
"urtle"          -0.06
"_deep"          -0.06
" túi"           -0.06

Positive Logits

Token            Logit
"ABC"            0.07
" difficulty"    0.07
" evidently"     0.07
"Unc"            0.07
"scriptions"     0.06
"има"            0.06
"-coordinate"    0.06
" Cord"          0.06
"ORD"            0.06
"-associated"    0.06

Activations Density 0.039%
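
Activation density is usually read as the fraction of sampled tokens on which the feature fires. Under that assumption (the page does not define the term), 0.039% corresponds to roughly 39 active tokens per 100,000. A minimal sketch, with a hypothetical function name and made-up activation values:

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds the threshold."""
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy sample: 10,000 tokens, of which 4 have a nonzero activation.
sample = [0.0] * 9_996 + [1.2, 0.3, 2.0, 0.7]
density = activation_density(sample)
density_percent = density * 100
```

The threshold of zero is itself an assumption; a dashboard could equally count only activations above some small positive cutoff.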

No Known Activations