Explanations

Production

np_max-act · gemini-2.0-flash

This neuron detects section-heading labels (e.g. “Production,” “Release,” “Reception”) in film‐related documents.

oai_token-act-pair · o4-mini Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

Token            Logit
" bitterness"    -0.06
"Scene"          -0.06
" Restart"       -0.06
"[]*"            -0.06
"_AX"            -0.06
" Kirk"          -0.06
" shelves"       -0.06
"IZ"             -0.06
" три"           -0.06
" HACK"          -0.06

Positive Logits

Token            Logit
"rottle"         0.07
"$obj"           0.07
"Sy"             0.07
"πε"             0.06
"Sq"             0.06
"Because"        0.06
"φων"            0.06
" surfaced"      0.06
"umptech"        0.06
"formData"       0.06
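
Logit lists like the two above are commonly produced by projecting a feature's decoder direction through the model's unembedding matrix and ranking tokens by the resulting score. The page does not say how its values were computed, so the sketch below is only an illustration under that assumption. The function names, the toy vocabulary, and all numbers are hypothetical and are not taken from this feature.

```python
def logit_effects(decoder_direction, unembedding, vocab):
    """Score each token by the dot product of the decoder direction
    with that token's column of the unembedding matrix."""
    scores = {}
    for token_index, token in enumerate(vocab):
        scores[token] = sum(
            d * row[token_index]
            for d, row in zip(decoder_direction, unembedding)
        )
    return scores


def top_and_bottom(scores, k):
    """Return the k most positive and the k most negative (token, score) pairs."""
    ranked = sorted(scores.items(), key=lambda item: item[1])
    most_positive = ranked[-k:][::-1]
    most_negative = ranked[:k]
    return most_positive, most_negative


# Hypothetical toy setup: residual width 2, vocabulary of 4 tokens.
decoder_direction = [1.0, -1.0]
unembedding = [
    [0.5, -0.25, 0.0, 0.25],  # one row per residual dimension
    [0.0, 0.25, 1.0, -0.5],   # one column per vocabulary token
]
vocab = ["a", "b", "c", "d"]

scores = logit_effects(decoder_direction, unembedding, vocab)
positive, negative = top_and_bottom(scores, k=2)
print("positive:", positive)
print("negative:", negative)
```

A real model would use a matrix library rather than nested lists; plain Python is used here only to keep the sketch self-contained.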

Activations Density 0.012%
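
A density figure like this is usually read as the fraction of token positions on which the feature has a nonzero activation; at 0.012% that would be roughly 1 token in 8,300. The page does not define the metric, so the following is a minimal sketch assuming that definition. The function name and the toy activations are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`."""
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 3 active positions out of 25,000 tokens.
toy_activations = [0.0] * 25_000
for position in (10, 5_000, 20_000):
    toy_activations[position] = 1.5

density = activation_density(toy_activations)
print(f"{density:.3%}")
```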

No Known Activations