Explanations

negations and negative phrases

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

ogan

-0.07

speaker

-0.07

bane

-0.07

sts

-0.06

onse

-0.06

ousel

-0.06

 æ°

-0.06

ơ

-0.06

лем

-0.06

otive

-0.06

Positive Logits

 alone

0.17

 Alone

0.13

alone

0.13

 sole

0.12

 saja

0.11

-alone

0.10

唯一

0.09

뿐

0.09

 only

0.09

 seul

0.09

Activations Density 0.012%
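
To put the density figure in context, the short sketch below works out how many tokens the prompt set contains and roughly how many of them would activate this feature. It assumes the density is measured over the same 24,576 prompts listed under Configuration and that it means the share of tokens with a nonzero activation; the page states neither, so treat the result as a rough estimate only.

```python
# Figures copied from the dashboard fields above.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.012

# Total tokens covered by the dashboard prompts.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Assumption: density is the share of those tokens with nonzero activation.
expected_active = total_tokens * DENSITY_PERCENT / 100

print(f"total tokens: {total_tokens:,}")
print(f"tokens expected to activate the feature: about {round(expected_active)}")
```

Under those assumptions the feature fires on only a few hundred of roughly three million tokens, which fits a sparse feature.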