Explanations

words related to cautionary statements or alerts

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

/by

-0.06

mos

-0.06

alam

-0.06

Ale

-0.06

/from

-0.06

gu

-0.06

ien

-0.06

 Buckley

-0.06

an

-0.05

ucha

-0.05

Positive Logits

ingly

0.08

ļĮ

0.08

exion

0.08

0.07

TOTYPE

0.07

-about

0.07

 rằng

0.07

eware

0.07

 Danger

0.07

APPER

0.07

Activations Density 0.004%
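
To put the density figure in context, the sketch below estimates how many tokens in the dashboard prompt set the feature fires on. It assumes that "Activations Density" means the percentage of tokens with a nonzero activation, and that it was measured over the same 24,576 prompts of 128 tokens each; neither assumption is confirmed by the page. The function name `expected_active_tokens` is hypothetical.

```python
def expected_active_tokens(num_prompts: int, tokens_per_prompt: int,
                           density_percent: float) -> tuple[int, float]:
    """Return (total tokens, estimated number of tokens where the feature is active).

    Assumption: density_percent is the share of tokens with a nonzero
    activation, expressed as a percentage.
    """
    total_tokens = num_prompts * tokens_per_prompt
    active_tokens = total_tokens * density_percent / 100.0
    return total_tokens, active_tokens


# Values taken from the dashboard configuration and density line.
total, active = expected_active_tokens(24_576, 128, 0.004)
print(f"total tokens: {total:,}")
print(f"estimated active tokens: {active:.1f}")
```

Under these assumptions the feature would be active on roughly 126 of about 3.1 million tokens, which fits a sparse feature tied to a narrow set of cautionary words.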