Explanations

phrases indicating the start of experiences or relationships

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each
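
For a sense of scale, the prompt configuration above implies a fixed token budget. A minimal sketch of the arithmetic, assuming every prompt is exactly 128 tokens long as stated (the variable names are illustrative and not taken from the dashboard):

```python
# Values as listed in the configuration above.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128

# Total tokens across all prompts, assuming no prompt is shorter than 128 tokens.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT
print(f"{total_tokens:,} tokens")
```

Under that assumption the dashboard is built from roughly 3.1 million tokens.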

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

 Пра

-0.07

retain

-0.07

 early

-0.07

borg

-0.06

ouce

-0.06

лагод

-0.06

 earlier

-0.06

recent

-0.06

esi

-0.06

agna

-0.06

Positive Logits

nings

0.10

 something

0.08

/end

0.07

一种

0.07

º

0.07

 began

0.07

 Begins

0.07

 begins

0.07

 begun

0.07

ıt

0.06

Activations Density 0.013%