Explanations

words related to being unwanted or undesirable

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each
Dataset (Dashboard): cerebras/SlimPajama-627B

Negative Logits

ignKey     -0.07
sein       -0.07
aeper      -0.07
senal      -0.07
Ã½Å¡       -0.07
stagram    -0.07
point      -0.06
bero       -0.06
tempts     -0.06
zon        -0.06

Positive Logits

unw         0.06
Uns         0.06
wel         0.06
owell       0.06
 Coil       0.06
emachine    0.06
izza        0.06
ingt        0.06
ly          0.06
Įĵ          0.06

Activations Density 0.001%
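
As a rough sanity check on what a density of 0.001% means in practice, the sketch below converts it into an approximate count of active token positions. It assumes the density is measured over the same dashboard prompt set listed under Configuration (24,576 prompts of 128 tokens each); the page does not state this, so treat the result as an order-of-magnitude estimate only. The variable names are illustrative.

```python
# Assumption: density is computed over the dashboard prompt set
# (24,576 prompts x 128 tokens). The page does not confirm this.
prompts = 24_576
tokens_per_prompt = 128
density_percent = 0.001  # "Activations Density 0.001%"

total_tokens = prompts * tokens_per_prompt
expected_active = total_tokens * density_percent / 100

print(f"total tokens: {total_tokens:,}")
print(f"approx. active positions: {expected_active:.1f}")
```

Under that assumption the feature fires on only a few dozen of roughly three million token positions, which fits a feature tied to a narrow set of words.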