Explanations

Politics and people

np_max-act · gemini-2.0-flash

The neuron predominantly fires on person names and formal titles, i.e., named-entity tokens that refer to individuals and their offices.

oai_token-act-pair · o4-mini Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted
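To make the configuration concrete, the following is a minimal sketch of how a single feature's activation is commonly computed in a ReLU sparse autoencoder that reads from a residual-stream hook such as blocks.11.hook_resid_post. It assumes that "standard" refers to the usual ReLU encoder with a decoder bias subtracted from the input. The page does not confirm that parameterization, and the function name, argument names, and toy values are hypothetical.

```python
def feature_activation(resid, w_enc_col, b_enc, b_dec):
    """Activation of one SAE feature for one residual-stream vector.

    Assumed (not confirmed by the page) ReLU parameterization:
        act = max(0, (resid - b_dec) . w_enc_col + b_enc)
    """
    # Center the input by the decoder bias.
    centered = [x - b for x, b in zip(resid, b_dec)]
    # Project onto this feature's encoder column and add its bias.
    pre_activation = sum(c * w for c, w in zip(centered, w_enc_col)) + b_enc
    # ReLU keeps the feature at exactly zero on most tokens.
    return max(0.0, pre_activation)


# Toy 4-dimensional vectors; the real residual stream is much wider.
resid = [1.0, 2.0, 0.0, -1.0]
b_dec = [0.0, 1.0, 0.0, 0.0]
w_enc_col = [0.5, 0.5, 0.0, 0.25]

active = feature_activation(resid, w_enc_col, b_enc=-0.25, b_dec=b_dec)
inactive = feature_activation(resid, w_enc_col, b_enc=-2.0, b_dec=b_dec)
print(active, inactive)
```

With 131,072 features, the same computation would be repeated once per encoder column, normally as a single matrix product rather than a Python loop.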

Negative Logits

pointer       -0.07
Ha            -0.06
 pointer      -0.06
orio          -0.06
-sp           -0.06
 expertise    -0.06
occasion      -0.06
 test         -0.06
 pointers     -0.06

Positive Logits

.visitInsn                      0.06
 бух                            0.06
 Profes                         0.06
anceled                         0.06
 Peter                          0.06
 -->↵↵↵                         0.06
 blatantly                      0.06
 Intern                         0.06
.ColumnHeadersHeightSizeMode    0.06
.."                             0.06
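Logit lists like the ones above are typically obtained by projecting a feature's decoder direction through the model's unembedding matrix and keeping the most negative and most positive entries. The sketch below illustrates that idea only. How this dashboard actually computes its values (for example, whether the final normalization layer is folded in) is an assumption, and the 2-dimensional vectors and scores are hypothetical toy values, not taken from the model.

```python
def logit_effects(decoder_dir, unembed):
    """Dot the feature's decoder direction with each token's unembedding vector."""
    return {
        token: sum(d * u for d, u in zip(decoder_dir, vec))
        for token, vec in unembed.items()
    }


def top_and_bottom(effects, k):
    """Return (k most positive, k most negative) tokens, strongest first."""
    ranked = sorted(effects.items(), key=lambda kv: kv[1])
    return ranked[-k:][::-1], ranked[:k]


# Hypothetical toy values for illustration only.
decoder_dir = [1.0, -1.0]
unembed = {
    " Peter": [0.5, 0.0],
    " Profes": [0.25, 0.0],
    " test": [0.0, 0.25],
    "pointer": [0.0, 0.5],
}

positive, negative = top_and_bottom(logit_effects(decoder_dir, unembed), k=2)
print(positive)
print(negative)
```

Leading spaces are part of the token strings, which is why " pointer" and "pointer" appear as separate entries in the lists.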

Activations Density 0.111%
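A density of 0.111% corresponds to the feature being active on roughly one token in 900. The sketch below shows one way such a figure could be computed, assuming density means the fraction of sampled tokens with a nonzero activation. The page does not state the exact definition, and the sample data is hypothetical.

```python
def activation_density(activations):
    """Fraction of tokens on which the feature is active (activation > 0)."""
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > 0)
    return active / len(activations)


# Hypothetical sample: 1 active token out of 1,000.
sample = [0.0] * 999 + [3.2]
density = activation_density(sample)
percent = f"{density:.3%}"
print(percent)
```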

No Known Activations