Explanations

expressions of honor and gratitude

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): cerebras/SlimPajama-627B

Negative Logits

Token      Logit
"orden"    -0.08
"pard"     -0.07
"OUCH"     -0.07
"icari"    -0.07
"iet"      -0.07
"域"       -0.07
"лоп"      -0.07
"سل"       -0.07
"cela"     -0.07

Positive Logits

Token            Logit
" privilege"     0.08
" privileged"    0.06
"ırı"            0.06
" privileges"    0.06
"mma"            0.06
" cand"          0.06
"privileged"     0.06
"to"             0.05
" Bols"          0.05
" Priv"          0.05
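Non-ASCII entries in logit lists like these are often displayed in the tokenizer's byte-level form, for example `Ð»Ð¾Ð¿` rather than `лоп`. The sketch below shows one way to convert between the two. It assumes the underlying tokenizer uses the GPT-2 style byte-to-unicode mapping, which this page does not state; `decode_token` is a hypothetical helper name, not part of any dashboard API.

```python
def bytes_to_unicode():
    """GPT-2 style table: every byte value gets a printable stand-in character."""
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            # Non-printable bytes are shifted to code points 256 and above.
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Reverse table: stand-in character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(display: str) -> str:
    """Hypothetical helper: turn a byte-level display string into readable text."""
    raw = bytearray()
    for ch in display:
        if ch in UNICODE_TO_BYTE:
            raw.append(UNICODE_TO_BYTE[ch])
        else:
            # Characters outside the table (e.g. a literal space) pass through.
            raw.extend(ch.encode("utf-8"))
    return raw.decode("utf-8", errors="replace")


print(decode_token("Ð»Ð¾Ð¿"))  # лоп
print(decode_token("Ä±rÄ±"))  # ırı
```

Tokens that are only a fragment of a multi-byte character will decode to the replacement character, since a single token need not be valid UTF-8 on its own.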

Activations Density 0.006%
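As a rough cross-check, the density can be related to the dashboard prompt configuration with a little arithmetic. This sketch assumes the density is the fraction of dashboard tokens on which the feature is active, which the page does not state explicitly; the variable names are illustrative, and because the displayed percentage is rounded, the resulting count is only approximate.

```python
# Figures taken from the dashboard configuration.
n_prompts = 24_576
tokens_per_prompt = 128
density_percent = 0.006  # displayed value, already rounded

# Total tokens covered by the dashboard prompts.
total_tokens = n_prompts * tokens_per_prompt

# Approximate number of tokens on which the feature fires,
# under the assumption stated above.
approx_active_tokens = total_tokens * density_percent / 100

print(total_tokens)                 # 3145728
print(round(approx_active_tokens))  # 189
```

In other words, under that assumption the feature would be active on roughly 190 of about 3.1 million tokens.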