Explanations

positive words

Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token (quoted to show leading spaces)    Logit
"Records"                                -0.98
" Records"                               -0.97
" Record"                                -0.91
"Datuak"                                 -0.91
" RECORDS"                               -0.89
" '\\;'"                                 -0.88
"records"                                -0.88
" Normdatei"                             -0.85
" estekak"                               -0.84
" cherchés"                              -0.82

Positive Logits

Token (quoted to show leading spaces)    Logit
" excellent"                             0.47
"Excellent"                              0.43
" safety"                                0.40
" perfect"                               0.38
" Excellent"                             0.35
"removeAttr"                             0.35
" flawless"                              0.35
"ter"                                    0.34
"ab"                                     0.33
" human"                                 0.33

Activations Density 0.000%