Explanations

the word

The neuron activates when the text is talking about a word itself, especially in metalinguistic phrases like “the word ‘…’.”

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token (quoted to show leading spaces)    Logit
" hausse"                                -1.24
"oping"                                  -1.15
"extremely"                              -1.05
" chuť"                                  -1.03
"();"                                    -1.02
" MAKING"                                -1.01
"荭"                                     -1.00
"ourced"                                 -0.99
"really"                                 -0.99
"]."                                     -0.97

Positive Logits

Token (quoted to show leading spaces)    Logit
" meticulous"                            1.02
" immaculate"                            0.95
" substantial"                           0.95
" suggests"                              0.93
"单词"                                   0.92
" flimsy"                                0.90
" entails"                               0.90
" encompasses"                           0.88
" nestled"                               0.86
" daunting"                              0.86

Activations Density 0.036%
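
To put the density figure in perspective, the short sketch below converts it into an approximate token count using the prompt configuration listed above. It assumes that "Activations Density" means the share of token positions on which the neuron has a nonzero activation, and that it was measured over the same 24,576 prompts of 128 tokens each; the dashboard does not state either point, so treat the result as a rough estimate. The variable names are illustrative only.

```python
# Figures quoted on the dashboard.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.036  # "Activations Density"

# Assumption (not stated on the dashboard): density is the share of
# token positions with nonzero activation, measured over these prompts.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT
approx_active_tokens = round(total_tokens * DENSITY_PERCENT / 100)

print(f"{total_tokens:,} tokens in total")
print(f"about {approx_active_tokens:,} tokens with nonzero activation")
```

Under these assumptions the neuron fires on roughly one token in every 2,800, which fits a feature tied to a narrow pattern such as explicit mentions of a word.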