Explanations

News articles

np_max-act · gemini-2.0-flash

The neuron activates on evaluative or opinion-bearing words (the kind of adverbs, adjectives, and modals that mark subjective commentary).

oai_token-act-pair · o4-mini · Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard): Various

Features: 131,072

Data Type: float32

Hook Name: blocks.11.hook_resid_post

Architecture: standard

Context Size: 1,024

Dataset: monology/pile-uncopyrighted
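For readers who want the configuration above in machine-readable form, here is a minimal sketch in Python. The values are copied from the listing; the `SaeConfig` class, its field names, and the `hook_layer` helper are hypothetical names introduced for illustration and do not come from any library. The sketch also checks that the layer index in the hook name agrees with the one in the SAE path.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SaeConfig:
    """Hypothetical container for the values listed under Configuration."""
    sae_path: str
    hook_name: str
    n_features: int
    dtype: str
    architecture: str
    context_size: int
    dataset: str


def hook_layer(hook_name: str) -> int:
    """Extract the layer index from a 'blocks.<n>.hook_resid_post' hook name."""
    match = re.fullmatch(r"blocks\.(\d+)\.hook_resid_post", hook_name)
    if match is None:
        raise ValueError(f"unexpected hook name: {hook_name!r}")
    return int(match.group(1))


CONFIG = SaeConfig(
    sae_path="andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1",
    hook_name="blocks.11.hook_resid_post",
    n_features=131_072,
    dtype="float32",
    architecture="standard",
    context_size=1024,
    dataset="monology/pile-uncopyrighted",
)

layer = hook_layer(CONFIG.hook_name)
# The SAE path and the hook name should name the same residual-stream layer.
path_consistent = f"resid_post_layer_{layer}" in CONFIG.sae_path
print(layer, path_consistent)
```

Freezing the dataclass keeps the record immutable, which suits values that describe an already-trained dictionary.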

Negative Logits

".nombre"  -0.07
" farewell"  -0.07
".Format"  -0.06
"Kag"  -0.06
"DONE"  -0.06
"rè"  -0.06
" annotations"  -0.06
" budgets"  -0.06
"Saudi"  -0.06
"hani"  -0.06

Positive Logits

" 上涨"  0.07
"ательных"  0.06
" NAFTA"  0.06
" meses"  0.06
"        ↵↵"  0.06
" cần"  0.06
" wished"  0.06
" })();↵"  0.06
" всем"  0.06
" properly"  0.06

Activations Density 0.023%
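The page does not define the density figure. A common reading, assumed here, is the fraction of token positions on which the feature's activation is nonzero. The sketch below illustrates that reading on toy data; `activation_density` is a hypothetical helper and the activation values are invented for the example, not taken from this feature.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if len(activations) == 0:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Toy data: 100,000 token positions, of which 23 have a nonzero activation.
toy_activations = [0.0] * 99_977 + [1.5] * 23
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

Under this reading, a density of 0.023% corresponds to roughly 23 firing tokens per 100,000.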

No Known Activations