Explanations

percentages

np_max-act · gemini-2.0-flash

The main thing this neuron does is detect numeric quantitative expressions (counts, percentages, rates) in the text.

oai_token-act-pair · o4-mini Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

_factor

-0.06

 Honor

-0.06

 Apprentice

-0.06

.Build

-0.06

 gens

-0.06

.rep

-0.06

ать

-0.06

Immediate

-0.06

_endpoint

-0.06

.sk

-0.06

Positive Logits

 propositions

0.07

\`

0.06

_PRI

0.06

 nejd

0.06

 inferior

0.06

 canActivate

0.06

 newData

0.06

».

0.06

 ماي

0.06

 unfavor

0.06

Activations Density 0.028%
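To put the density figure in perspective, here is a minimal Python sketch. It assumes "activation density" means the fraction of token positions at which the feature fires with a value above zero. The page does not define the term, so that reading is an assumption. The helper names `activation_density` and `expected_active_tokens` and the toy activation list are hypothetical, for illustration only.

```python
def activation_density(activations):
    """Fraction of token positions where the feature activation is > 0.

    Assumption: this is how the dashboard's density figure is defined.
    """
    if not activations:
        return 0.0
    return sum(1 for a in activations if a > 0) / len(activations)


def expected_active_tokens(density, context_size):
    """Expected number of activating tokens in one context window."""
    return density * context_size


# Toy activations (made up): the feature fires on 2 of 8 tokens.
toy = [0.0, 0.0, 1.3, 0.0, 0.0, 0.0, 0.7, 0.0]
toy_density = activation_density(toy)

# Figures from this page: density 0.028%, context size 1,024 tokens.
per_context = expected_active_tokens(0.028 / 100, 1024)

print(f"toy density: {toy_density:.2%}")
print(f"expected active tokens per context: {per_context:.3f}")
```

Under that assumption, a density of 0.028% works out to roughly 0.29 activating tokens per 1,024-token context, so the feature would fire about once in every three to four contexts on average.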
