Explanations

is

np_max-act · gemini-2.0-flash

The neuron detects the phrase “is a” in introductory statements, e.g. right after a company name, where the sentence defines what the entity is.

oai_token-act-pair · o4-mini Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

 twilight: -0.06
 ripe: -0.06
cet: -0.06
-tax: -0.06
nish: -0.06
practice: -0.06
wid: -0.05
sweet: -0.05
았: -0.05
_NR: -0.05

Positive Logits

REAM: 0.08
籍: 0.07
cone: 0.07
 wired: 0.07
.toCharArray: 0.07
 cellForRowAtIndexPath: 0.06
ویی: 0.06
.XRLabel: 0.06
 จำก: 0.06
ный: 0.06
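
The logit lists above are usually read as the tokens whose output logits the feature pushes down or up the most. The page does not state how they are computed, so the following is only a minimal sketch under one common assumption: the feature's decoder direction is projected through the model's unembedding matrix and the tokens are ranked by the result. The function name `top_logit_tokens`, the two-dimensional direction, and the matrix values are hypothetical and chosen only so the toy ranking is easy to follow.

```python
def top_logit_tokens(decoder_direction, unembedding, vocab, k=3):
    """Return (most negative, most positive) lists of (token, logit effect).

    decoder_direction: list of d_model floats (hypothetical feature direction)
    unembedding: d_model rows, each a list of len(vocab) floats
    """
    d_model = len(decoder_direction)
    # Logit effect per token: dot product of the direction with that
    # token's column of the unembedding matrix.
    effects = [
        sum(decoder_direction[i] * unembedding[i][j] for i in range(d_model))
        for j in range(len(vocab))
    ]
    ranked = sorted(zip(vocab, effects), key=lambda pair: pair[1])
    negative = ranked[:k]          # lowest effects first
    positive = ranked[-k:][::-1]   # highest effects first
    return negative, positive


# Toy example: made-up numbers, not values from the real model.
vocab = [" twilight", "REAM", "cone", " ripe"]
direction = [1.0, -0.5]
unembedding = [
    [-0.04, 0.06, 0.05, -0.04],
    [0.04, -0.04, -0.04, 0.02],
]
negative, positive = top_logit_tokens(direction, unembedding, vocab, k=2)
print("negative:", negative)
print("positive:", positive)
```

With effects this small (all below 0.1 in magnitude on the real page), the ranking may say little about what the feature does to the output; that reading is an interpretation, not something the page states.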

Activations Density 0.037%
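
The page does not define the density figure. A minimal sketch, assuming it means the fraction of sampled token positions on which the feature's activation is above zero; the helper `activation_density` and the toy activations are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`."""
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data: 10,000 token positions, 4 of which activate the feature.
toy_activations = [0.0] * 9996 + [1.2, 0.3, 2.5, 0.7]
density = activation_density(toy_activations)
print(f"{density:.3%}")  # prints 0.040%
```

Under this reading, a density of 0.037% would correspond to the feature firing on roughly 37 of every 100,000 tokens.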
