Explanations

with
(np_max-act · gemini-2.0-flash)

The neuron detects promotional feature-highlighting language in real-estate listings (e.g. words introducing property amenities like “boasts,” “offers,” “with,” “featuring,” etc.).
(oai_token-act-pair · o4-mini, triggered by @xinyanhu8)

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

Logit  Token
-0.07  " -------------------------------------------------------------------------"
-0.06  "しか"
-0.06  " ninth"
-0.06  "经过"
-0.06  "沉"
-0.06  " Citizen"
-0.06  "('/')"
-0.06  " Guill"
-0.06  "arger"
-0.06  "lando"

Positive Logits

Logit  Token
0.06  " analogy"
0.06  "-HT"
0.06  "websocket"
0.06  " backpack"
0.06  "ilater"
0.06  " Vegetable"
0.06  " Veterinary"
0.06  "alted"
0.06  "discover"
0.06  "(pipe"

Activations Density 0.010%
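
Activation density is usually read as the fraction of sampled tokens on which the feature fires. The sketch below assumes that definition with a firing threshold of zero; the function name, the threshold, and the sample values are illustrative assumptions rather than the dashboard's actual computation.

```python
from typing import Sequence

def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of token positions whose activation exceeds `threshold`."""
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)

# Hypothetical sample: the feature fires on 1 of 10,000 tokens.
acts = [0.0] * 9999 + [2.5]
density = activation_density(acts)
print(f"{density:.3%}")
```

A density of 0.010% corresponds to roughly one active token in every 10,000, so the feature is very sparse.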

No Known Activations
