Explanations

phrases expressing enjoyment or positive experiences

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): cerebras/SlimPajama-627B

Negative Logits

Token       Logit
"rage"      -0.06
"AKE"       -0.06
"/scripts"  -0.06
" Lage"     -0.06
"shr"       -0.06
"lassen"    -0.06
"wit"       -0.05
"gger"      -0.05
"wa"        -0.05
"lay"       -0.05

Positive Logits

Token       Logit
"BOSE"      0.08
"oriously"  0.07
"itesse"    0.07
"stras"     0.07
"_barrier"  0.06
"oenix"     0.06
"elines"    0.06
"eus"       0.06
".sel"      0.06
"/sn"       0.06

Activations Density: 0.007%
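
To put the density figure in context, the sketch below works out how many tokens the dashboard prompts contain and roughly how many of them the feature would fire on. It assumes that "Activations Density" means the fraction of token positions with a nonzero activation, which the dashboard does not state; the function names are illustrative only.

```python
# Figures copied from the dashboard configuration and density readout.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY_PERCENT = 0.007


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of token positions across all dashboard prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> float:
    """Rough count of positions where the feature is active.

    ASSUMPTION: density is the fraction of token positions with a
    nonzero activation, expressed as a percentage.
    """
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, ACTIVATION_DENSITY_PERCENT)

print(f"Total tokens: {total:,}")
print(f"Estimated active tokens: {active:.0f}")
```

Under that assumption the feature is active on only a couple of hundred positions out of roughly three million, so the displayed density is rounded coarsely enough that the estimate should be read as an order of magnitude, not an exact count.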