Explanations

np_max-act · gemini-2.0-flash

Explanation of neuron 4 behavior: the main thing this neuron does is find numerical tokens (digits or numbers).

oai_token-act-pair · o4-mini Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

" trainers"       -0.07
" Withdraw"       -0.07
"demo"            -0.06
" стара"          -0.06
" discriminator"  -0.06
"igua"            -0.06
"ath"             -0.06
" bells"          -0.06
" bien"           -0.06
"較"              -0.06

Positive Logits

"=False"          0.08
"GROUND"          0.07
"Bron"            0.07
" frightening"    0.07
" FRONT"          0.06
".Identifier"     0.06
"aped"            0.06
" Usually"        0.06
"olean"           0.06
"atisfied"        0.06
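
The negative and positive logit lists above are the kind of output you get by projecting a feature's decoder direction through the model's unembedding matrix and sorting the per-token scores. The sketch below illustrates that idea with NumPy. The function name `top_logit_tokens`, the toy matrices, and the toy vocabulary are all assumptions made for illustration; they are not the actual weights, and this is not necessarily the exact procedure the dashboard uses.

```python
import numpy as np

def top_logit_tokens(decoder_dir, unembed, vocab, k=2):
    """Return the k most negative and k most positive (token, logit) pairs.

    decoder_dir: (d_model,) feature direction (hypothetical input)
    unembed:     (d_model, vocab_size) unembedding matrix (hypothetical input)
    vocab:       list of vocab_size token strings
    """
    logits = decoder_dir @ unembed          # one score per vocabulary token
    order = np.argsort(logits)              # ascending: most negative first
    negative = [(vocab[i], float(logits[i])) for i in order[:k]]
    positive = [(vocab[i], float(logits[i])) for i in order[::-1][:k]]
    return negative, positive

# Toy example with a 2-dimensional residual stream and 4 made-up tokens.
decoder_dir = np.array([1.0, -1.0])
unembed = np.array([
    [0.08, -0.07, 0.01, 0.00],
    [0.00,  0.00, 0.00, 0.06],
])
vocab = ["tok_a", "tok_b", "tok_c", "tok_d"]

negative, positive = top_logit_tokens(decoder_dir, unembed, vocab, k=2)
print("negative:", negative)
print("positive:", positive)
```

With these toy inputs, `tok_b` and `tok_d` come out as the most negative tokens and `tok_a` as the most positive. Logit effects as small as those listed above (all within about ±0.08) suggest the feature has only a weak direct influence on next-token predictions.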

Activations Density: 0.015%
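
An activation density like the one above is commonly read as the fraction of sampled tokens on which the feature is active. The sketch below shows one way such a figure could be computed, assuming "active" means an activation strictly above a threshold of zero. The function name `activation_density`, the threshold, and the synthetic activations are assumptions for illustration, not the dashboard's actual pipeline or data.

```python
import numpy as np

def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose feature activation exceeds the threshold."""
    acts = np.asarray(activations, dtype=float)
    return float(np.mean(acts > threshold))

# Synthetic example: the feature fires on 3 of 20,000 tokens.
acts = np.zeros(20_000)
acts[[5, 100, 9_999]] = 1.5

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.015%
```

At a density this low, the feature is active on roughly 15 tokens out of every 100,000 sampled.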
