Explanations

np_max-act · gemini-2.0-flash

The neuron fires on words or short phrases that signal a speaker’s subjective stance or evaluation (e.g. “hard,” “really,” “anyone who knows me,” “by now”).

oai_token-act-pair · o4-mini Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted

Negative Logits

"mites"        -0.08
"КИ"           -0.07
" afterward"   -0.06
"OSE"          -0.06
" Covent"      -0.06
" />↵"         -0.06
"TT"           -0.06
" bahis"       -0.06

Positive Logits

" Satisfaction"   0.06
"cez"             0.06
".Accessible"     0.06
"larıyla"         0.06
" Müslüman"       0.06
" azal"           0.06
"jectory"         0.06
" Calgary"        0.06
"]=$"             0.06
" retiring"       0.06
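
Logit tables like the ones above are commonly produced by projecting a feature's decoder direction through the model's unembedding matrix and ranking tokens by the resulting score. Whether this dashboard computes its values exactly this way is an assumption. The sketch below illustrates the idea on toy data in plain Python. The function names, the 2-dimensional vectors, and the three-token vocabulary are all hypothetical and are not taken from the SAE or model listed under Configuration.

```python
def logit_effects(decoder_dir, unembed, vocab):
    """Score each vocabulary token by the dot product of the feature's
    decoder direction with that token's unembedding column.

    decoder_dir: list of d floats (hypothetical feature direction)
    unembed:     d x V nested list (hypothetical unembedding matrix)
    vocab:       list of V token strings
    Returns (token, score) pairs sorted from most negative to most positive.
    """
    d = len(decoder_dir)
    effects = []
    for j, token in enumerate(vocab):
        score = sum(decoder_dir[i] * unembed[i][j] for i in range(d))
        effects.append((token, score))
    return sorted(effects, key=lambda pair: pair[1])


def top_and_bottom(effects, k):
    """Split sorted effects into the k most negative and k most positive."""
    negative = effects[:k]
    positive = effects[-k:][::-1]  # largest score first
    return negative, positive


# Toy example (made-up numbers, not real model weights).
toy_decoder = [1.0, -1.0]
toy_unembed = [
    [0.5, 0.0, -0.5],
    [0.0, 0.25, 0.5],
]
toy_vocab = ["a", "b", "c"]

effects = logit_effects(toy_decoder, toy_unembed, toy_vocab)
negative, positive = top_and_bottom(effects, k=1)
print(negative)  # [('c', -1.0)]
print(positive)  # [('a', 0.5)]
```

The small magnitudes reported above (all within about ±0.08) would be consistent with a mid-depth feature whose direct effect on output tokens is weak, although that reading is an interpretation rather than something the page states.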

Activations Density 0.166%
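
Activation density is usually read as the fraction of token positions on which the feature is active. At 0.166%, that would be roughly one token in 600. The sketch below shows that calculation on toy data. The helper name, the zero threshold, and the activation values are assumptions for illustration, not the dashboard's actual pipeline.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds threshold.

    activations: list of per-token feature activations (hypothetical data)
    threshold:   values strictly above this count as active (assumed 0.0)
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Toy data: 2 active tokens out of 1,000 (made-up values).
toy_activations = [0.0] * 998 + [1.7, 0.4]
density = activation_density(toy_activations)
print(f"{density:.3%}")  # 0.200%
```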
