Explanations

whether, if, and how to do

The neuron fires on second-person, reader-addressing terms, especially “you” and interrogative words (who, what, whether), that is, in question or instruction contexts directed at the reader.

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token          Logit
"こんばんは"    -1.02
" latter"      -1.01
"服务的"        -1.00
"되지"          -1.00
"usually"      -0.99
"often"        -0.98
"generally"    -0.98
" hardly"      -0.98
"couldn"       -0.98
" acceptez"    -0.97

Positive Logits

Token          Logit
" your"        2.47
"可能"          1.55
"if"           1.53
"you"          1.47
" before"      1.43
" possible"    1.31
" yourself"    1.25
"rillation"    1.23
" potential"   1.16
"how"          1.15

Activations Density 0.025%
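
As a rough sanity check, the prompt configuration and the activation density together suggest how many token positions this feature fires on. The sketch below assumes that "Activations Density" is the fraction of all sampled token positions with non-zero activation, and that every one of the 24,576 prompts is exactly 128 tokens long. The function name `estimate_active_tokens` is hypothetical and not part of any dashboard API.

```python
def estimate_active_tokens(
    num_prompts: int, tokens_per_prompt: int, density_percent: float
) -> tuple[int, float]:
    """Return (total token positions, estimated number of active positions).

    Assumes density_percent is the share of token positions on which the
    feature has non-zero activation (an interpretation, not a confirmed
    definition).
    """
    total_tokens = num_prompts * tokens_per_prompt
    active = total_tokens * density_percent / 100.0
    return total_tokens, active


# Figures taken from the dashboard: 24,576 prompts, 128 tokens each, 0.025%.
total, active = estimate_active_tokens(24_576, 128, 0.025)
print(f"{total:,} token positions, about {active:.0f} with non-zero activation")
```

Under these assumptions the feature is active on roughly 786 of about 3.1 million token positions, so the listed logits and explanation summarize a very sparse slice of the dataset.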