Explanations

galaxies and stars

np_max-act · gemini-2.0-flash

The neuron detects domain-specific astrophysics terminology in scientific text, especially nouns for key celestial objects and processes (e.g. “stars,” “galaxies,” “evolution,” “clusters”).

oai_token-act-pair · o4-mini Triggered by @xinyanhu8

Configuration

andyrdt/saes-llama-3.1-8b-instruct/resid_post_layer_11/trainer_1

Dataset (Dashboard)

Various

Features

131,072

Data Type

float32

Hook Name

blocks.11.hook_resid_post

Architecture

standard

Context Size

1,024

Dataset

monology/pile-uncopyrighted

Negative Logits

.emp

-0.06

undler

-0.06

(condition

-0.06

 انر

-0.06

.INT

-0.06

ে

-0.06

 Conflict

-0.06

 pager

-0.06

มากมาย

-0.06

「お

-0.05

Positive Logits

*>(&

0.07

-develop

0.07

idi

0.06

Emb

0.06

 sexually

0.06

 numerous

0.06

 TLabel

0.06

')."

0.06

',"

0.06

 orchestrated

0.06

Activations Density 0.004%


No Known Activations