INDEX

Explanations

negations or expressions of refusal and their contexts

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

ä¾

-0.07

aldi

-0.07

Completed

-0.06

uyết

-0.06

itational

-0.06

 maximal

-0.06

ấp

-0.06

저

-0.06

anical

-0.06

aversable

-0.05

Positive Logits

apr

0.07

.Keyword

0.07

egg

0.06

msp

0.06

_DUMP

0.06

 RequestOptions

0.06

ỡ

0.06

Blank

0.06

lj

0.06

ês

0.06

Activations Density 0.017%