Explanations

phrases related to offers and commitments without obligations

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): cerebras/SlimPajama-627B

Negative Logits

Token               Logit
"oa"                -0.08
"pron"              -0.07
"indir"             -0.07
" Gross"            -0.07
"ainter"            -0.06
"raci"              -0.06
"AttributeValue"    -0.06
" Manual"           -0.06
"inals"             -0.06
"_MAN"              -0.06

Positive Logits

Token               Logit
" harmless"         0.12
" safe"             0.11
" Harm"             0.11
" without"          0.11
"no"                0.10
"Safe"              0.10
" nothing"          0.10
"safe"              0.10
"无"                0.10
"-safe"             0.09
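
Dashboards for models with a byte-level BPE vocabulary sometimes show a multi-byte token in its raw vocabulary form, such as æĹł, instead of the readable character 无 ("without", "none"). The sketch below shows how such a string can be mapped back to text. It assumes the model uses the GPT-2 style byte-to-unicode table, which this page does not confirm; `decode_token` is a hypothetical helper name, not part of any dashboard API.

```python
def bytes_to_unicode():
    """Build the GPT-2 style byte -> printable character table (assumed here)."""
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            # Bytes without a printable form are shifted to code points >= 256.
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


def decode_token(raw: str) -> str:
    """Hypothetical helper: turn a raw vocabulary string back into text."""
    inverse = {ch: b for b, ch in bytes_to_unicode().items()}
    return bytes(inverse[ch] for ch in raw).decode("utf-8", errors="replace")


raw_token = "\u00e6\u0139\u0142"  # renders as æĹł
print(decode_token(raw_token))    # the three bytes E6 97 A0, i.e. U+65E0
```

Under that assumption the token reads as 无, which fits the other positive logits ("without", "no", "nothing").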

Activations Density 0.068%
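
To get a rough sense of scale, the density can be combined with the dashboard configuration. The sketch below assumes that the density is the fraction of tokens on which the feature is active and that it was measured over the dashboard prompts listed above. Neither assumption is stated on this page, so the result is an order-of-magnitude estimate only; `active_token_estimate` is a hypothetical helper.

```python
def active_token_estimate(prompts: int, tokens_per_prompt: int, density_percent: float):
    """Return (total tokens, estimated active tokens) under the stated assumptions."""
    total_tokens = prompts * tokens_per_prompt
    estimated_active = total_tokens * density_percent / 100
    return total_tokens, estimated_active


total, active = active_token_estimate(24_576, 128, 0.068)
print(f"{total:,} tokens in total, roughly {round(active):,} with a nonzero activation")
```

With these figures, 24,576 × 128 = 3,145,728 tokens, of which about 2,139 would activate the feature.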