Explanations

Comparing items

np_max-act · gemini-2.0-flash

The neuron activates on instructional or “how-to” directives—i.e. imperative verbs and guidance steps in technical explanations.

oai_token-act-pair · o4-mini Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

Token                 Logit
"−"                   -0.07
" avere"              -0.07
(no visible token)    -0.06
"rades"               -0.06
"ī"                   -0.06
"تم"                  -0.06
"sah"                 -0.06
" blends"             -0.06
" Bars"               -0.06
"лим"                 -0.06

Positive Logits

Token                 Logit
" tranny"             0.08
" culmination"        0.07
"Secretary"           0.07
" Alternatively"      0.07
" Dropbox"            0.06
"upply"               0.06
"Equip"               0.06
" congressman"        0.06
"८"                   0.06
" Text"               0.06
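Logit lists like the ones above are commonly produced by projecting a feature's decoder direction through the model's unembedding matrix and sorting the per-token scores. The sketch below illustrates that idea with toy numbers only. It assumes the dashboard values are derived this way, which the page itself does not state, and the vocabulary, matrices, and the function name `logit_effects` are hypothetical.

```python
def logit_effects(decoder_dir, unembed, vocab):
    """Score each token by the dot product of the feature's decoder
    direction with that token's column of the unembedding matrix,
    then return (token, score) pairs sorted from most negative to most positive."""
    scores = []
    for tok_idx, token in enumerate(vocab):
        score = sum(d * row[tok_idx] for d, row in zip(decoder_dir, unembed))
        scores.append((token, score))
    return sorted(scores, key=lambda pair: pair[1])

# Toy example: 2-dim residual stream, 4-token vocabulary (all values made up)
vocab = ["tok_a", "tok_b", "tok_c", "tok_d"]
decoder_dir = [1.0, -1.0]
unembed = [
    [0.5, -0.25, 0.0, 0.75],  # residual dimension 0, one entry per token
    [0.0, 0.25, 0.75, 0.0],   # residual dimension 1, one entry per token
]

ranked = logit_effects(decoder_dir, unembed, vocab)
most_negative = ranked[0]   # head of the "negative logits" list
most_positive = ranked[-1]  # head of the "positive logits" list
```

In a real setting the same ranking would run over the full vocabulary, and only the few most extreme tokens at each end would be displayed.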

Activation Density: 0.052%
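A density figure of this kind is usually read as the fraction of token positions on which the feature fires at all. The sketch below shows that calculation on made-up data. The zero threshold and the function name `activation_density` are assumptions, not details taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds the threshold."""
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)

# Toy data: 100,000 token positions, 52 of which activate the feature
acts = [0.0] * 100_000
for i in range(52):
    acts[i * 1000] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.052%
```

At this rate the feature fires on roughly one token in every two thousand, which is typical of a sparse feature.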


No Known Activations
