INDEX

Explanations

calls to cease harmful or unwanted actions

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

cerebras/SlimPajama-627B

Negative Logits

arya

-0.06

agina

-0.06

utschein

-0.06

olds

-0.06

-archive

-0.06

ricks

-0.06

yonel

-0.06

imates

-0.06

athom

-0.05

abler

-0.05

Positive Logits

 Vulcan

0.06

niž

0.06

्रय

0.06

csi

0.06

aken

0.06

otu

0.06

 being

0.06

ertime

0.06

_feed

0.06

 TCHAR

0.06

Activations Density 0.010%