Explanations

conversational writing

np_max-act · gemini-2.0-flash

mentions of race in discussions about social behavior or attitudes.

oai_token-act-pair · gpt-4o-mini Triggered by @xinyanhu8

The neuron detects the comma immediately following a conditional demographic clause (e.g., the comma in “If you’re a black person,”).

oai_token-act-pair · o4-mini Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

Token            Logit
"�"              -0.06
" hypoc"         -0.06
" Covenant"      -0.06
" conce"         -0.06
"ap"             -0.06
"анти"           -0.06
" workspace"     -0.06
" 대부분"        -0.06
"ndef"           -0.06
"ap"             -0.06

Positive Logits

Token            Logit
".music"         0.07
"shared"         0.07
"_BIT"           0.07
" hotline"       0.07
" ster"          0.07
"(red"           0.06
" derby"         0.06
"Ül"             0.06
":::::::::::::"  0.06
" Micha"         0.06

Activations Density 0.003%
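
To make the density figure concrete, here is a minimal Python sketch. It assumes that activation density means the fraction of token positions on which the feature's activation is above zero; the dashboard does not state its exact definition. The function names are hypothetical and are not part of any Neuronpedia API, and the toy activation values are made up for illustration.

```python
def activation_density(activations):
    """Fraction of token positions where the feature is active (> 0).

    Assumption: "density" is the share of non-zero activations.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > 0)
    return active / len(activations)


def expected_active_tokens(density_percent, n_tokens):
    """Expected number of active positions among n_tokens at a density given in percent."""
    return density_percent / 100.0 * n_tokens


# Toy activations (illustrative only): 1 active position out of 4.
toy = [0.0, 0.0, 0.5, 0.0]
toy_density = activation_density(toy)

# Figures taken from this page: density 0.003%, context size 1,024 tokens.
per_context = expected_active_tokens(0.003, 1024)
per_100k = expected_active_tokens(0.003, 100_000)
```

Under that assumption, a density of 0.003% corresponds to about 3 active positions per 100,000 tokens, or roughly 0.03 per 1,024-token context.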

No Known Activations