INDEX

Explanations

phrases indicating enjoyment or positive experiences

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

adu          -0.07
.Library     -0.07
.opensource  -0.06
ãĥ³ãĥ        -0.06
úsqueda      -0.06
ocy          -0.06
naires       -0.06
ept          -0.06
sled         -0.06
terdam       -0.06

Positive Logits

 freedoms    0.08
 success     0.08
 freedom     0.08
 succès      0.08
ably         0.07
enville      0.07
iji          0.06
 status      0.06
gos          0.06
 economies   0.06

Activations Density 0.013%
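
To put the density figure in context, the short sketch below estimates how many tokens in the dashboard prompt set the feature fires on. It assumes that "Activations Density" means the fraction of tokens with a nonzero activation and that it was measured over the same 24,576 prompts of 128 tokens listed under Configuration; neither assumption is stated on the page, and the function name is illustrative only.

```python
# Figures copied from the dashboard configuration above.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY_PERCENT = 0.013  # reported on the page as a percentage


def estimate_active_tokens(num_prompts, tokens_per_prompt, density_percent):
    """Return (total tokens, estimated tokens on which the feature is active).

    Assumption: density is the share of all dashboard tokens with a
    nonzero activation, expressed in percent.
    """
    total = num_prompts * tokens_per_prompt
    active = total * density_percent / 100
    return total, active


total_tokens, active_tokens = estimate_active_tokens(
    NUM_PROMPTS, TOKENS_PER_PROMPT, ACTIVATION_DENSITY_PERCENT
)
print(f"total tokens:  {total_tokens:,}")
print(f"active tokens: about {round(active_tokens)}")
```

Under these assumptions the feature is active on roughly 400 of about 3.1 million tokens. Because the reported density is rounded to two significant figures, the result is an order-of-magnitude estimate rather than an exact count.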