Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Autoritní: -0.87
 مرئيه: -0.67
GEBURTS: -0.62
 kefir: -0.62
 metros: -0.61
Personensuche: -0.61
IsContent: -0.59
रीदारी: -0.59
 Picchu: -0.59
 whiteness: -0.59

Positive Logits

LikeLike: 0.41
urit: 0.41
 anzi: 0.41
extAlignment: 0.40
ibration: 0.39
bland: 0.39
\]: 0.36
dubbo: 0.36
ضان: 0.35

Activations Density 0.003%