Explanations

section dividers or markers

The neuron fires on structural markup labels, especially section, theorem, lemma, proposition, and corollary, together with the surrounding brackets that mark document headings or theorem-style environments.
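To make the explanation concrete, here is a rough sketch of the kind of text it describes: a label from the list above sitting next to a backslash or bracket. The regex and the name `find_markup_labels` are illustrative assumptions, not part of the dashboard. This is only a hand-written proxy, and the real neuron may respond to more (or fewer) patterns.

```python
import re

# Labels named in the explanation; assumed for illustration, not exhaustive.
LABELS = ("section", "theorem", "lemma", "proposition", "corollary")

# A label immediately preceded by "\", "{" or "[", optionally followed by a
# closing or opening bracket, e.g. "\section{", "{theorem}", "[Lemma]".
LABEL_RE = re.compile(
    r"[\\{\[](" + "|".join(LABELS) + r")[}\]{]?",
    re.IGNORECASE,
)


def find_markup_labels(text: str) -> list[str]:
    """Return the lowercase labels that appear next to markup brackets."""
    return [match.group(1).lower() for match in LABEL_RE.finditer(text)]


sample = r"\section{Intro} \begin{theorem} Let x. \end{theorem} See [Lemma] 2."
print(find_markup_labels(sample))
```

Plain prose such as "a section on lemmas" yields no matches under this sketch, because the labels are not adjacent to any markup bracket.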

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token            Logit
"in"             -1.30
" during"        -1.16
"at"             -1.09
"as"             -1.07
" most"          -0.98
" when"          -0.97
" might"         -0.97
"may"            -0.97
" primarily"     -0.96
[token missing]  -0.94

Positive Logits

Token            Logit
" renkli"        1.26
" enfermos"      1.23
" propuestas"    1.17
" görüntüsü"     1.15
":}"             1.15
"kussion"        1.09
"autres"         1.07
"ܙ"              1.07
" questions"     1.07
" results"       1.06

Activations Density 0.135%
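
As a quick sanity check on scale, the sketch below combines the density with the dashboard configuration (24,576 prompts of 128 tokens each). It assumes the density is the fraction of token positions with a nonzero activation, measured over that same prompt set. The page does not state this explicitly, and the helper `activation_density` is a hypothetical name.

```python
NUM_PROMPTS = 24_576          # from the dashboard configuration
TOKENS_PER_PROMPT = 128       # from the dashboard configuration
ACTIVATION_DENSITY = 0.135 / 100  # 0.135% as a fraction

# Total token positions in the dashboard prompt set.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Approximate number of positions where the neuron is active,
# ASSUMING the density was measured over exactly this prompt set.
expected_active_tokens = total_tokens * ACTIVATION_DENSITY


def activation_density(activations: list[float]) -> float:
    """Fraction of positions with activation > 0 (assumed definition)."""
    if not activations:
        return 0.0
    return sum(1 for a in activations if a > 0) / len(activations)


print(total_tokens)                   # 3145728
print(round(expected_active_tokens))  # 4247
```

Under these assumptions the neuron is active on roughly 4,200 of about 3.1 million token positions, which is consistent with a sparse feature tied to specific markup.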