INDEX

Explanations

bad followed by negative outcomes

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token                    Logit
"迨"                     -3.11
"茀"                     -2.97
"菹"                     -2.95
"噲"                     -2.80
(token not rendered)     -2.78
" mentes"                -2.67
"퐿"                     -2.66
"阄"                     -2.64
(token not rendered)     -2.64
"hmmm"                   -2.61

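Positive Logits

Token                    Logit
"或"                     3.19
(token not rendered)     3.14
(token not rendered)     2.94
"</"                     2.41
(token not rendered)     2.39
"bo"                     2.36
(token not rendered)     2.34
" Virtually"             2.34
"</b>"                   2.30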

Activations Density 0.008%
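
To put the density figure in context, the sketch below converts it into an approximate token count. It assumes, and the page does not confirm, that the density is the fraction of all dashboard tokens (24,576 prompts of 128 tokens each) on which the feature has a nonzero activation. The function name `estimate_active_tokens` is hypothetical and used only for illustration.

```python
def estimate_active_tokens(num_prompts, tokens_per_prompt, density_percent):
    """Return (total tokens, estimated number of tokens where the feature fires).

    Assumption: density_percent is the share of all dashboard tokens with a
    nonzero activation, expressed as a percentage.
    """
    total_tokens = num_prompts * tokens_per_prompt
    estimated_active = total_tokens * density_percent / 100.0
    return total_tokens, estimated_active


# Figures taken from the dashboard configuration above.
total, active = estimate_active_tokens(24_576, 128, 0.008)
print(f"total tokens: {total:,}")
print(f"estimated active tokens: {active:.0f}")
```

Under that assumption the feature fires on roughly 250 of about 3.1 million tokens, which is consistent with a sparse, narrowly targeted feature.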