Explanations

code

np_max-act · gemini-2.0-flash

The neuron activates on quoted command identifiers (e.g. "google", "browse_website", "start_agent"); in other words, it detects command names enclosed in double quotes.
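To make the described pattern concrete, here is a minimal text-level sketch of what "command names enclosed in double quotes" could look like. It is only an approximation for illustration: the regular expression, the `quoted_identifiers` helper, and the sample prompt are all assumptions, not something taken from the dashboard, and the feature itself operates on residual-stream activations rather than on string matching.

```python
import re

# Hypothetical approximation: a double-quoted, lowercase snake_case identifier.
QUOTED_IDENTIFIER = re.compile(r'"([a-z][a-z0-9_]*)"')


def quoted_identifiers(text: str) -> list[str]:
    """Return the identifiers that appear inside double quotes, in order."""
    return QUOTED_IDENTIFIER.findall(text)


# Made-up sample text in the style of an agent command list.
sample = (
    "Commands:\n"
    '1. Google Search: "google"\n'
    '2. Browse Website: "browse_website"\n'
    '3. Start GPT Agent: "start_agent"\n'
)

found = quoted_identifiers(sample)
print(found)
```

On this sample the helper picks out the three quoted command names and ignores the unquoted descriptions next to them, which matches the behaviour the explanation attributes to the feature.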

oai_token-act-pair · o4-mini Triggered by @xinyanhu8

Configuration: andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

"nen"           -0.06
" Easter"       -0.06
" Movies"       -0.06
" خلق"          -0.06
"<Texture"      -0.06
".offer"        -0.06
"atest"         -0.06
" German"       -0.06
" Daly"         -0.06
"Fu"            -0.06

Positive Logits

" Aires"        0.06
" accordance"   0.06
"ливий"         0.06
" guit"         0.06
"ENABLE"        0.06
"entin"         0.06
".ACTION"       0.06
"ificados"      0.06
" Break"        0.06
" Beckham"      0.06

Activations Density 0.004%
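As a rough illustration of how a density figure like this could be read, the sketch below assumes that activation density means the fraction of tokens on which the feature's activation is above zero. That definition, the `activation_density` helper, and the synthetic activation values are assumptions made for the example; the dashboard does not state how the number is computed.

```python
def activation_density(activations: list[float], threshold: float = 0.0) -> float:
    """Fraction of token positions whose activation exceeds the threshold."""
    if not activations:
        raise ValueError("activations must be non-empty")
    firing = sum(1 for a in activations if a > threshold)
    return firing / len(activations)


# Synthetic data: 100,000 token positions, the feature fires on only 4 of them.
acts = [0.0] * 100_000
for position, value in [(10, 1.5), (500, 0.3), (7_000, 2.1), (42_000, 0.8)]:
    acts[position] = value

density = activation_density(acts)
print(f"{density:.3%}")
```

Under this reading, a density of 0.004% corresponds to roughly 4 firing tokens per 100,000, which fits a feature that responds only to a narrow pattern.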
