Explanations

Casual text

np_max-act · gemini-2.0-flash

The neuron spikes on the document’s author-voice or opinion phrases (e.g. “I …,” “it is obvious,” “office,” “working there”), i.e. self-referential statements and subjective commentary.

oai_token-act-pair · o4-mini Triggered by @xinyanhu8

office

np_max-act · gemini-2.5-flash Triggered by @aardvarkkr

The neuron activates on pronouns, possessive adjectives, and conjunctions when referring to people, often in the context of personal involvement or identity.

oai_token-act-pair · gemini-2.5-flash Triggered by @aardvarkkr

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various

Features: 131,072

Data Type: float32

Hook Name: blocks.11.hook_resid_post

Architecture: standard

Context Size: 1,024

Dataset: monology/pile-uncopyrighted

Negative Logits

servers    -0.07
FromDate    -0.06
 -------↵    -0.06
_joint    -0.06
 thêm    -0.06
$query    -0.06
pp    -0.06
Particles    -0.06
 chops    -0.06
מ    -0.06

Positive Logits

Lud    0.07
";"    0.06
-dollar    0.06
 tidak    0.06
 велик    0.06
 Gentle    0.06
 dereg    0.06
„P    0.06
overe    0.06
 encuentra    0.06
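
This page does not say how the logit lists are computed. A common convention for dashboards like this one, assumed here rather than confirmed, is to project the feature's decoder direction through the model's unembedding matrix and list the tokens with the lowest and highest scores. The sketch below illustrates that idea on toy data; `top_logit_tokens`, the vectors, and the three-token vocabulary are all hypothetical.

```python
from typing import List, Sequence, Tuple


def top_logit_tokens(
    decoder_dir: Sequence[float],
    unembed: Sequence[Sequence[float]],
    vocab: Sequence[str],
    k: int = 1,
) -> Tuple[List[Tuple[str, float]], List[Tuple[str, float]]]:
    """Return (most negative, most positive) tokens for one feature.

    decoder_dir: feature direction in the residual stream, length d_model.
    unembed: d_model rows, each with one entry per vocabulary token.
    """
    scores = [
        sum(decoder_dir[i] * unembed[i][j] for i in range(len(decoder_dir)))
        for j in range(len(vocab))
    ]
    ranked = sorted(zip(vocab, scores), key=lambda pair: pair[1])
    negative = ranked[:k]        # lowest scores first
    positive = ranked[::-1][:k]  # highest scores first
    return negative, positive


# Toy example (assumed values): d_model = 2, vocabulary of 3 tokens.
toy_dir = [1.0, -1.0]
toy_unembed = [
    [0.5, -0.2, 0.1],
    [0.1, 0.3, -0.4],
]
toy_vocab = ["a", "b", "c"]
negative, positive = top_logit_tokens(toy_dir, toy_unembed, toy_vocab)
print(negative[0][0], positive[0][0])
```

Under this reading, values as small as ±0.06 would suggest the feature has only a weak direct effect on any single output token.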

Activations Density 0.103%
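
The density figure can be read as the fraction of sampled tokens on which the feature fires. That reading is an assumption, since the page does not define the metric. The sketch below computes such a fraction from a list of per-token activations; `activation_density` and the sample values are hypothetical.

```python
from typing import Iterable


def activation_density(activations: Iterable[float], threshold: float = 0.0) -> float:
    """Fraction of tokens whose activation is strictly above `threshold`."""
    values = list(activations)
    if not values:
        raise ValueError("need at least one activation")
    active = sum(1 for a in values if a > threshold)
    return active / len(values)


# Hypothetical activations for one feature over 8 sampled tokens.
sample = [0.0, 0.0, 1.7, 0.0, 0.0, 0.0, 0.4, 0.0]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 25.000%
```

On that reading, a density of 0.103% corresponds to roughly 1,030 active tokens per million sampled.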

No Known Activations