Explanations

obligation

np_max-act · gemini-2.0-flash

This neuron activates on modal or normative verbs expressing what ought to or should happen (e.g. should, ought).

oai_token-act-pair · o4-mini Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted
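As a rough sketch, the configuration values listed above could be captured in a small typed record so that downstream code can check them. The class name `SaeConfig`, its field names, and the `layer` helper are hypothetical and chosen only for illustration; they are not part of any library API.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SaeConfig:
    """Hypothetical container for the values shown on this page."""
    sae_id: str
    hook_name: str
    n_features: int
    dtype: str
    architecture: str
    context_size: int
    dataset: str

    @property
    def layer(self) -> int:
        # Assumes hook names of the form "blocks.<layer>.hook_resid_post".
        match = re.fullmatch(r"blocks\.(\d+)\.hook_resid_post", self.hook_name)
        if match is None:
            raise ValueError(f"unexpected hook name: {self.hook_name!r}")
        return int(match.group(1))


# Values copied from the Configuration section of this page.
config = SaeConfig(
    sae_id="andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1",
    hook_name="blocks.11.hook_resid_post",
    n_features=131_072,
    dtype="float32",
    architecture="standard",
    context_size=1_024,
    dataset="monology/pile-uncopyrighted",
)

print(config.layer)       # layer index parsed from the hook name
print(config.n_features)  # dictionary size of the SAE
```

Parsing the layer from the hook name, rather than storing it separately, keeps the record consistent with the `resid_post_layer_11` segment of the SAE identifier.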

Negative Logits

Logit   Token
-0.07   "enville"
-0.07   "tau"
-0.07   " included"
-0.06   " 上海"
-0.06   "interpreted"
-0.06   " Пло"
-0.06   "وف"
-0.06   "Ih"
-0.06   "чень"
-0.06   "harga"

Positive Logits

Logit   Token
0.07    "�"
0.07    " раб"
0.06    " پیر"
0.06    " 제가"
0.06    " Εκ"
0.06    "Ag"
0.06    " movement"
0.06    " milestones"
0.06    "$user"
0.06    "面"

Activations Density 0.096%
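To put the density figure in context, the sketch below assumes that activation density means the fraction of token positions at which the feature's activation is above zero. That definition is an assumption, as are the function name and the threshold parameter. Under it, a density of 0.096% at a context size of 1,024 corresponds to roughly one active token per context.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of positions where the activation exceeds the threshold."""
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Figures taken from this page.
DENSITY = 0.096 / 100   # 0.096% expressed as a fraction
CONTEXT_SIZE = 1024

# Expected number of active tokens in one full context window.
expected_per_context = DENSITY * CONTEXT_SIZE

# Average number of tokens between activations.
tokens_per_activation = 1 / DENSITY

print(f"{expected_per_context:.3f} active tokens per context")
print(f"about 1 activation every {tokens_per_activation:.0f} tokens")

# Toy usage of the helper on made-up activations (one of four is active).
print(activation_density([0.0, 0.0, 1.5, 0.0]))
```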

No Known Activations