Explanations

-

np_max-act · gemini-2.0-flash

The neuron fires on numeric tokens, especially signed or decimal numbers; in other words, it detects number-like entries in the text.

oai_token-act-pair · o4-mini Triggered by @xinyanhu8
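
To make the explanation above concrete, here is a minimal sketch of what "number-like" could mean at the token level. The pattern and the helper name `is_number_like` are illustrative assumptions, not anything taken from the feature itself or from the auto-interp pipeline; the feature's real behavior may be broader or narrower.

```python
import re

# Hypothetical pattern: optional sign, then digits with an optional
# decimal part, or a bare decimal such as ".5".
NUMBER_LIKE = re.compile(r"[+-]?(?:\d+(?:\.\d*)?|\.\d+)")


def is_number_like(token: str) -> bool:
    """Return True if the token, ignoring surrounding whitespace,
    looks like a signed or decimal number."""
    return NUMBER_LIKE.fullmatch(token.strip()) is not None


# Example tokens; leading spaces mimic tokenizer output.
tokens = [" -0.07", "42", "+3.", ".5", " taco", "height", "-"]
flags = [is_number_like(t) for t in tokens]
```

Under this sketch a lone sign such as "-" is not counted as number-like, which is a design choice of the pattern rather than a claim about the feature.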

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

" groß"         -0.07
"課"            -0.07
"ίο"            -0.06
" işte"         -0.06
" різних"       -0.06
"γχ"            -0.06
" nhờ"          -0.06
"height"        -0.06
" breakfast"    -0.06
" testimon"     -0.06

Positive Logits

" trans"        0.07
"buquerque"     0.06
" Taco"         0.06
" delic"        0.06
" reins"        0.06
" Crab"         0.06
" NSMutable"    0.06
" slur"         0.06
" invitations"  0.06
" taco"         0.06
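
For readers unfamiliar with logit tables like the ones above, the following is a rough sketch of one common way such values are obtained: projecting a feature's decoder direction through the model's unembedding matrix and ranking tokens by the result. Whether this dashboard computes its values exactly this way is an assumption. The sizes and random weights below are toy stand-ins, not the real model's parameters.

```python
import random

random.seed(0)

# Toy sizes (assumed for illustration; far smaller than the real model).
d_model, vocab_size = 8, 20

# Stand-in for one feature's decoder direction in the residual stream.
decoder_direction = [random.gauss(0, 1) for _ in range(d_model)]
# Stand-in for the unembedding matrix, shape (d_model, vocab_size).
unembed = [[random.gauss(0, 1) for _ in range(vocab_size)]
           for _ in range(d_model)]

# Logit effect per token: dot product of the direction with each column.
logit_effect = [
    sum(decoder_direction[i] * unembed[i][t] for i in range(d_model))
    for t in range(vocab_size)
]

# Rank token ids from most negative to most positive effect.
ranked = sorted(range(vocab_size), key=lambda t: logit_effect[t])
most_negative = ranked[:10]
most_positive = ranked[::-1][:10]
```

In a real setting the token ids would be decoded back to strings, which is where leading spaces in entries such as " taco" come from.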

Activations Density 0.004%
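
As a quick sanity check on what this density implies, the arithmetic can be sketched as below. It assumes that "density" means the fraction of tokens on which the feature has a nonzero activation, and it reuses the context size of 1,024 listed in the configuration; both interpretations are assumptions about the dashboard rather than documented facts.

```python
# Values copied from the dashboard.
density_percent = 0.004   # Activations Density, in percent
context_size = 1024       # Context Size, in tokens

# Convert the percentage to a fraction of tokens.
density = density_percent / 100

# Average number of tokens between activations, under the assumption above.
tokens_per_activation = 1 / density

# Expected number of activating tokens in one full-length context.
expected_per_context = density * context_size

print(f"about 1 activation per {tokens_per_activation:,.0f} tokens")
print(f"about {expected_per_context:.3f} activations per context")
```

Under these assumptions the feature fires on roughly one token in 25,000, so most 1,024-token contexts would contain no activation at all.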

No Known Activations