Explanations

mean

Configuration

Prompts (Dashboard)

16,384 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token       Logit
" mean"     -1.02
"ide"       -0.91
" surla"    -0.66
"IDE"       -0.63
"mean"      -0.54
"—"         -0.53
"ütü"       -0.52
" ideal"    -0.51
"bara"      -0.49
" MEAN"     -0.48

Positive Logits

Token               Logit
" Diſ"              0.68
" coherence"        0.66
" ſche"             0.64
" Anſ"              0.63
" Reſ"              0.61
" purpoſe"          0.61
" fubject"          0.60
" pleaſure"         0.60
" Monfieur"         0.60
" tartalomajánló"   0.60

Activations Density 0.031%
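
To put the density figure in context, the following is a minimal sketch that estimates how many dashboard tokens this feature fires on. It assumes that "Activations Density" means the fraction of tokens in the dashboard prompts with a nonzero activation; that interpretation is an assumption, as are the variable names. The displayed percentage is rounded, so the result is only approximate.

```python
# Values copied from the dashboard configuration and density line.
NUM_PROMPTS = 16_384
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.031  # "Activations Density" as displayed (rounded)

# Total number of tokens across all dashboard prompts.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Assumption: density = fraction of dashboard tokens with nonzero activation.
estimated_active_tokens = total_tokens * DENSITY_PERCENT / 100

print(f"total tokens: {total_tokens:,}")
print(f"estimated activating tokens: {estimated_active_tokens:.0f}")
```

Under that assumption, the feature activates on roughly 650 of about 2.1 million tokens, which is consistent with a sparse feature.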