INDEX

Explanations

guard

Configuration

Prompts (Dashboard)

16,384 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

-0.54

 tale

-0.52

 studio

-0.52

ت

-0.48

war

-0.46

φ

-0.45

ed

-0.45

ple

-0.44

né

-0.44

 comp

-0.44

Positive Logits

 Majefty    1.04

LabelTagHelper    0.97

 Chriftian    0.91

 Monfieur    0.89

 pleaſure    0.89

 itſelf    0.88

 дописавши    0.88

 myſelf    0.86

 مشين    0.86

 himſelf    0.85

Activations Density 0.112%
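
To put the density figure in context, the short sketch below works out how many token positions the dashboard prompts cover and roughly how many of them the feature would fire on. It assumes that "Activations Density" means the fraction of token positions with a nonzero activation, and that it was measured over the same 16,384 prompts of 128 tokens listed under Configuration; neither assumption is confirmed by the page itself, and the function names are illustrative only.

```python
# Figures copied from the dashboard configuration and density line above.
NUM_PROMPTS = 16_384
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY_PERCENT = 0.112


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of token positions covered by the dashboard prompts."""
    return num_prompts * tokens_per_prompt


def expected_active_tokens(total: int, density_percent: float) -> int:
    """Approximate count of token positions where the feature is active.

    Assumption: density is the share of token positions with a nonzero
    activation, expressed as a percentage.
    """
    return round(total * density_percent / 100)


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = expected_active_tokens(total, ACTIVATION_DENSITY_PERCENT)

print(f"Token positions in dashboard prompts: {total:,}")
print(f"Approximate active positions at {ACTIVATION_DENSITY_PERCENT}% density: {active:,}")
```

Under these assumptions the prompts cover 2,097,152 token positions, of which roughly 2,349 would show a nonzero activation.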