Explanations

mountains

np_max-act · gemini-2.0-flash

This neuron fires on mentions of mountains—the word “mountain” itself or the names of specific peaks.

oai_token-act-pair · o4-mini Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

" Section": -0.06
" Volunteer": -0.06
" Authority": -0.06
" irrig": -0.06
"‌ی": -0.06
"Ek": -0.06
" territories": -0.06
".material": -0.06
" controlling": -0.06
"/github": -0.06

Positive Logits

" آلة": 0.07
"_take": 0.06
"incl": 0.06
"แพร": 0.06
"racuse": 0.06
"ickname": 0.06
"October": 0.06
" частина": 0.06
"圭圭": 0.06
"・・・": 0.06

Activations Density 0.014%
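
The density figure can be read as the share of token positions on which the feature is active. The page does not state the exact definition, so the sketch below assumes "active" means an activation strictly above zero. The function name `activation_density` and the toy data are hypothetical, chosen only so that the result matches the 0.014% shown above.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: a feature counts as "active" when its activation is > threshold.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical toy data: 100,000 token positions, feature active on 14 of them.
acts = [0.0] * 100_000
for i in range(14):
    acts[i * 7000] = 1.0

density = activation_density(acts)
print(f"{density:.3%}")
```

At this density the feature is active on roughly 14 tokens per 100,000, which fits a feature described as firing only on mountain-related tokens.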

No Known Activations