Explanations

phrases indicating internal conflicts or struggles

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): cerebras/SlimPajama-627B

Negative Logits

Token               Logit
"алю"               -0.09
"ALER"              -0.09
"qus"               -0.08
".MixedReality"     -0.08
"edm"               -0.08
" Foley"            -0.08
"imson"             -0.07
"nde"               -0.07
"ALAR"              -0.07
"bai"               -0.07

Positive Logits

Token               Logit
" self"             0.10
" himself"          0.07
" themselves"       0.06
" author"           0.06
" Self"             0.06
" owner"            0.06
"elf"               0.06
" according"        0.06
"-self"             0.06
" participants"     0.06

Activations Density 0.095%
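
For a sense of scale, the density figure can be converted into an approximate token count. The sketch below assumes that activation density means the fraction of sampled dashboard tokens on which the feature fires, and that the sample is the 24,576 prompts of 128 tokens each listed under Configuration. Neither assumption is stated on this page, and the function names are illustrative only.

```python
def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of token positions in the dashboard sample."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(density_percent: float, n_tokens: int) -> int:
    """Approximate count of tokens on which the feature fires.

    Assumes density is the percentage of sampled tokens with a
    nonzero activation (an assumption, not confirmed by the page).
    """
    return round(n_tokens * density_percent / 100)


# Values taken from the Configuration and Activations Density fields.
n_tokens = total_tokens(24_576, 128)
n_active = estimated_active_tokens(0.095, n_tokens)

print(f"sampled tokens: {n_tokens:,}")
print(f"estimated active tokens: {n_active:,}")
```

Under these assumptions the feature would fire on roughly three thousand of about 3.1 million sampled tokens, which is consistent with a sparse feature.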