Explanations

terms

np_max-act · gemini-2.0-flash

This neuron fires on the instruction phrase “Collect the terms in,” especially on the word “terms” (and its neighbors “the” and “in”).

oai_token-act-pair · o4-mini · Triggered by @xinyanhu8

Configuration: andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1
Dataset (Dashboard): Various
Features: 131,072
Data Type: float32
Hook Name: blocks.11.hook_resid_post
Architecture: standard
Context Size: 1,024
Dataset: monology/pile-uncopyrighted
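As a rough illustration of what a "standard" architecture usually means for a sparse autoencoder, the sketch below computes ReLU feature activations from a single residual-stream vector, such as one read at blocks.11.hook_resid_post. It is a sketch under stated assumptions, not this SAE's actual code: the function name `sae_encode`, the subtraction of the decoder bias before encoding, and the toy sizes are all illustrative. The real dictionary has 131,072 features, and the Llama 3.1 8B residual stream is 4,096 wide.

```python
import random


def sae_encode(x, W_enc, b_enc, b_dec):
    """ReLU encoder: one non-negative activation per feature for one vector x.

    Assumption: the input is centered by the decoder bias before encoding,
    which is a common convention but is not confirmed for this SAE.
    """
    centered = [xi - bi for xi, bi in zip(x, b_dec)]
    acts = []
    for j in range(len(b_enc)):
        pre = sum(centered[i] * W_enc[i][j] for i in range(len(centered))) + b_enc[j]
        acts.append(max(pre, 0.0))  # ReLU keeps only positive pre-activations
    return acts


# Toy sizes so the sketch runs instantly (hypothetical, not the real shapes).
random.seed(0)
d_model, n_features = 8, 32
W_enc = [[random.gauss(0, 1) for _ in range(n_features)] for _ in range(d_model)]
b_enc = [0.0] * n_features
b_dec = [0.0] * d_model

x = [random.gauss(0, 1) for _ in range(d_model)]  # stand-in residual vector
acts = sae_encode(x, W_enc, b_enc, b_dec)
```

In practice this would be a single matrix multiply in an array library; plain lists are used here only to keep the example dependency-free.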

Negative Logits

"-design"      -0.08
"zap"          -0.07
"/fl"          -0.07
" Flint"       -0.07
"Semaphore"    -0.07
" เซ"          -0.07
"inp"          -0.07
" Import"      -0.07
"-close"       -0.06
"_calls"       -0.06

Positive Logits

"asi"            0.06
"(Html"          0.06
":nth"           0.06
" muestra"       0.06
" hük"           0.06
"<source"        0.05
"чої"            0.05
" Çünkü"         0.05
"ınca"           0.05
" constituent"   0.05
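Logit lists like the ones above are commonly produced by projecting a feature's decoder direction through the model's unembedding matrix and ranking the vocabulary by the resulting score. The sketch below shows that ranking on a toy example. It assumes this dashboard follows that common recipe, which the page itself does not confirm, and real pipelines may also apply the final layer norm first. The names `top_and_bottom_tokens`, `decoder_row`, and `W_U` are hypothetical.

```python
def top_and_bottom_tokens(decoder_row, W_U, vocab, k=3):
    """Rank tokens by the logit contribution of one feature direction.

    decoder_row: the feature's decoder vector, length d_model.
    W_U: unembedding matrix as d_model rows of vocab-sized lists.
    Returns (negative, positive): the k lowest and k highest (token, logit) pairs.
    """
    logits = [
        sum(d * W_U[i][t] for i, d in enumerate(decoder_row))
        for t in range(len(vocab))
    ]
    ranked = sorted(zip(vocab, logits), key=lambda pair: pair[1])
    negative = ranked[:k]          # most negative first
    positive = ranked[::-1][:k]    # most positive first
    return negative, positive


# Toy example: 2-dimensional model, 4-token vocabulary (made-up values).
vocab = ["a", "b", "c", "d"]
W_U = [
    [0.5, -0.25, 0.0, 0.25],
    [9.0, 9.0, 9.0, 9.0],
]
decoder_row = [1.0, 0.0]  # only the first dimension contributes

negative, positive = top_and_bottom_tokens(decoder_row, W_U, vocab, k=2)
```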

Activations Density 0.005%
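For scale, a density of 0.005% corresponds to roughly 5 active tokens per 100,000. The sketch below computes that ratio, assuming "density" means the fraction of sampled tokens on which the feature's activation is above zero. That definition and the function name `activation_density` are assumptions made for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions where the feature activation exceeds threshold."""
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up sample: 5 firing tokens out of 100,000.
sample = [0.0] * 99_995 + [1.2] * 5
density = activation_density(sample)
formatted = f"{density:.3%}"
```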

No Known Activations