Explanations

finish

np_max-act · gemini-2.0-flash

The neuron activates on words and phrases that signal the end or completion of a process or iteration (e.g., “end,” “finished”).

oai_token-act-pair · o4-mini Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted
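
As a rough illustration of what the configuration above describes, here is a minimal sketch of a ReLU sparse autoencoder forward pass. It assumes that "standard" refers to the common encode/decode form shown in the comments; the page itself does not spell this out. The toy sizes (2 inputs, 3 features) and all weights are made up for illustration. The real dictionary would map the layer-11 residual stream (4,096 dimensions in Llama 3.1 8B) to 131,072 features.

```python
# Toy sketch of a ReLU sparse autoencoder forward pass.
# Assumption (not confirmed by the page): "standard" means
#   features = ReLU(W_enc @ (x - b_dec) + b_enc)
#   recon    = W_dec @ features + b_dec
# All weights below are hypothetical and tiny.

def relu(vec):
    return [max(0.0, a) for a in vec]

def matvec(matrix, vec):
    # Multiply a row-major matrix (list of rows) by a vector.
    return [sum(w * a for w, a in zip(row, vec)) for row in matrix]

def sae_forward(x, w_enc, b_enc, w_dec, b_dec):
    centered = [a - b for a, b in zip(x, b_dec)]
    pre_acts = [p + b for p, b in zip(matvec(w_enc, centered), b_enc)]
    features = relu(pre_acts)
    recon = [r + b for r, b in zip(matvec(w_dec, features), b_dec)]
    return features, recon

# Shapes: W_ENC is n_features x d_model, W_DEC is d_model x n_features.
W_ENC = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
B_ENC = [0.0, 0.0, 0.0]
W_DEC = [[1.0, 0.0, -1.0], [0.0, 1.0, -1.0]]
B_DEC = [0.0, 0.0]

features, recon = sae_forward([2.0, -1.0], W_ENC, B_ENC, W_DEC, B_DEC)
print(features, recon)
```

Only the first toy feature fires for this input, which mirrors the sparsity a dictionary of this kind is trained to produce.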

Negative Logits

 tấn       -0.06
take       -0.06
 Attack    -0.06
 login     -0.06
 <?=$      -0.06
_',        -0.06
_factor    -0.06
发布       -0.06
_tags      -0.06
benef      -0.06

Positive Logits

.major       0.07
 President   0.07
Pose         0.07
 которая     0.07
 života      0.07
 khai        0.07
مال          0.06
�            0.06
лей          0.06
.makeText    0.06
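
Token lists like the ones above are often produced by projecting a feature's decoder direction through the model's unembedding matrix and sorting the resulting per-token scores. Whether this dashboard computes them exactly that way is an assumption. The sketch below uses a hypothetical three-token vocabulary and made-up vectors only to show the ranking step.

```python
# Sketch: rank tokens by the dot product of a feature's decoder direction
# with each token's unembedding vector.
# Assumption: this is how the logit lists are derived; the vocabulary and
# all numbers here are hypothetical.

def logit_effects(decoder_dir, unembed):
    # unembed maps token string -> unembedding vector (length d_model).
    return {
        tok: sum(d * u for d, u in zip(decoder_dir, vec))
        for tok, vec in unembed.items()
    }

def bottom_and_top(effects, k):
    ranked = sorted(effects.items(), key=lambda kv: kv[1])
    most_negative = ranked[:k]
    most_positive = ranked[::-1][:k]
    return most_negative, most_positive

DECODER_DIR = [1.0, 0.0]
UNEMBED = {"a": [0.5, 1.0], "b": [-0.5, 0.0], "c": [0.0, 2.0]}

effects = logit_effects(DECODER_DIR, UNEMBED)
negative, positive = bottom_and_top(effects, k=1)
print(negative, positive)
```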

Activations Density 0.017%
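
To put the density figure in perspective, the sketch below treats it as the fraction of tokens on which the feature has a nonzero activation. That definition is an assumption, since the page does not state one. Under it, 0.017% works out to roughly 0.17 active tokens per 1,024-token context. The helper names are hypothetical.

```python
# Sketch: activation density as the fraction of tokens with activation > 0.
# Assumption: the dashboard's "Activations Density" uses this definition.

def activation_density(activations, threshold=0.0):
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)

def expected_active_per_context(density, context_size):
    return density * context_size

# Toy activation trace: one of four tokens fires.
toy_density = activation_density([0.0, 0.0, 1.5, 0.0])

# Figures from the page: 0.017% density, context size 1,024.
reported_density = 0.017 / 100
per_context = expected_active_per_context(reported_density, 1024)
print(toy_density, round(per_context, 3))
```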

No Known Activations
