
Explanations

to (np_max-act · gemini-2.0-flash)

This neuron detects phrases expressing a personal proposal or intention—particularly first-person (“I would like to…,” “I believe we could…”) statements used when making a request or offering a collaboration. (oai_token-act-pair · o4-mini, triggered by @xinyanhu8)


Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various

Features: 131,072

Data Type: float32

Hook Name: blocks.11.hook_resid_post

Architecture: standard

Context Size: 1,024

Dataset: monology/pile-uncopyrighted
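
The configuration values above can be gathered into a single record for scripting. The following is a minimal sketch, not the dashboard's own schema: the class name `SaeConfig` and its field names are hypothetical, and the residual width of 4096 is an assumption taken from the Llama 3.1 8B architecture rather than from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SaeConfig:
    """Hypothetical container for the values listed on this page."""
    sae_id: str
    hook_name: str
    n_features: int
    dtype: str
    architecture: str
    context_size: int
    dataset: str


cfg = SaeConfig(
    sae_id="andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1",
    hook_name="blocks.11.hook_resid_post",
    n_features=131_072,
    dtype="float32",
    architecture="standard",
    context_size=1_024,
    dataset="monology/pile-uncopyrighted",
)

# Assumption (not stated on this page): Llama 3.1 8B has a residual width of 4096.
D_MODEL = 4096

# Ratio of dictionary size to residual width under that assumption.
expansion_factor = cfg.n_features // D_MODEL

# The layer index is the second dot-separated field of the hook name.
layer = int(cfg.hook_name.split(".")[1])

print(expansion_factor, layer)
```

Under the stated assumption the dictionary is 32 times wider than the residual stream, and the hook name resolves to layer 11, which agrees with `resid_post_layer_11` in the identifier.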


Negative Logits

튼    -0.08
[max    -0.06
 Remark    -0.06
RCS    -0.06
Custom    -0.06
Stage    -0.06
WK    -0.06
 minden    -0.06
ohen    -0.05
_log    -0.05

Positive Logits

">';↵    0.07
istické    0.07
*>(&    0.07
 #↵↵    0.07
());↵    0.06
crud    0.06
uers    0.06
 strugg    0.06
);↵↵↵    0.06
:↵↵↵    0.06
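
Token lists like these are commonly produced by projecting a feature's decoder direction through the model's unembedding matrix and ranking the resulting per-token scores. Whether this page was computed exactly that way is an assumption. The sketch below illustrates the idea on a made-up three-token vocabulary; the function names, vectors, and tokens are all hypothetical and are not the values shown above.

```python
def logit_effects(decoder_vec, unembed_rows, vocab):
    """Dot the feature's decoder direction with each token's unembedding vector."""
    return {
        tok: sum(d * u for d, u in zip(decoder_vec, row))
        for tok, row in zip(vocab, unembed_rows)
    }


def bottom_and_top(effects, k):
    """Return the k most negative and the k most positive tokens."""
    ranked = sorted(effects.items(), key=lambda kv: kv[1])
    return ranked[:k], ranked[::-1][:k]


# Toy data (invented for illustration): 2-dimensional residual stream.
vocab = ["tok_a", "tok_b", "tok_c"]
decoder_vec = [1.0, -1.0]
unembed_rows = [
    [0.5, 0.0],    # tok_a
    [0.0, 0.5],    # tok_b
    [0.25, 0.25],  # tok_c
]

effects = logit_effects(decoder_vec, unembed_rows, vocab)
negatives, positives = bottom_and_top(effects, k=1)
print("negative:", negatives)
print("positive:", positives)
```

The small magnitudes on this page (all within ±0.08) would, under this reading, mean the feature pushes output logits only weakly in any direction.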

Activations Density 0.039%
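
An activation density is usually read as the fraction of token positions on which the feature fires at all. Assuming that definition (the page does not spell it out), a density of 0.039% corresponds to roughly 39 active positions per 100,000 tokens. The sketch below shows the calculation on a synthetic sample; the function name and the sample are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds the threshold."""
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Synthetic sample: 100,000 token positions, 39 of them non-zero.
sample = [0.0] * 99_961 + [1.5] * 39
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.039%
```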
