INDEX

Explanations

refusing

Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each
Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token              Logit
" appunto"         -0.64
"tagext"           -0.59
"ConfigureAwait"   -0.54
"PRNewswire"       -0.52
"ViewImports"      -0.52
"zow"              -0.50
"wpi"              -0.50
" venons"          -0.50
" dedans"          -0.50
"AsUp"             -0.49

Positive Logits

Token              Logit
"sex"              0.52
" friendship"      0.49
"✨:"              0.48
"SuccessListener"  0.48
"AnchorStyles"     0.48
"ReusableCell"     0.47
" food"            0.46
"<bos>"            0.45
" eating"          0.44
" helping"         0.44

Activations Density 0.003%