Explanations

Initials/Names

np_max-act · gemini-2.0-flash

This neuron detects capitalized proper nouns, especially names of people and organizations.

oai_token-act-pair · o4-mini Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

 allocator    -0.06
شتر    -0.06
istically    -0.06
_val    -0.06
 Shapiro    -0.06
られ    -0.06
 embodiments    -0.06
.optimizer    -0.06
서관    -0.06
 housed    -0.06

Positive Logits

sư    0.08
�    0.06
)m    0.06
 gymn    0.06
.SwingConstants    0.06
 jednot    0.06
infos    0.06
เทพ    0.06
 quilt    0.06
(instruction    0.06
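
For readers unfamiliar with logit lists like the ones above, here is a minimal sketch of how such values are commonly derived. It assumes that each score is the dot product of the feature's decoder direction with a token's unembedding vector; this page does not state how its values were computed. The names `logit_effects` and `top_and_bottom` and the toy vectors are hypothetical and do not come from the dashboard.

```python
def logit_effects(decoder_direction, unembedding, vocab):
    """Score each token by the dot product of the feature's decoder
    direction with that token's unembedding vector (assumed method)."""
    scores = {}
    for token, column in zip(vocab, unembedding):
        scores[token] = sum(d * w for d, w in zip(decoder_direction, column))
    return scores


def top_and_bottom(scores, k):
    """Return the k most positive and the k most negative tokens."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:k], ranked[-k:][::-1]


# Toy example: a 2-dimensional residual stream and a 3-token vocabulary.
vocab = ["a", "b", "c"]
unembedding = [[0.5, 0.0], [0.0, 0.5], [0.25, 0.25]]  # one vector per token
direction = [1.0, -1.0]                               # hypothetical decoder row

scores = logit_effects(direction, unembedding, vocab)
positive, negative = top_and_bottom(scores, k=1)
print("positive:", positive)
print("negative:", negative)
```

Under this assumption, values as small and uniform as those listed here (roughly ±0.06) would suggest that the feature has little direct effect on output logits.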

Activations Density 0.090%
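
As a rough illustration of what a density figure like this can mean, the sketch below treats activation density as the fraction of token positions where the feature's activation exceeds zero. That definition is an assumption, and the function name `activation_density` and the toy data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions where the feature fires
    (activation strictly above the threshold)."""
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Toy data: the feature fires on 9 of 10,000 token positions.
toy_activations = [0.0] * 9991 + [1.5] * 9
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

On this reading, a density of 0.090% corresponds to the feature firing on about 9 of every 10,000 tokens.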

No Known Activations