Explanations

eliminating unwanted things

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token           Logit
" causando"     -0.90
" Schleswig"    -0.88
" Drit"         -0.86
"enz"           -0.84
" anticor"      -0.83
" tropics"      -0.83
" morons"       -0.82
"óg"            -0.80
" defoli"       -0.80
" rans"         -0.80

Positive Logits

Token             Logit
" unwanted"       1.84
" undesirable"    1.34
" stubborn"       1.32
"bad"             1.30
" excess"         1.28
" undes"          1.26
" stratég"        1.15
" konkrét"        1.14
" både"           1.11
" Meksi"          1.09

Activations Density 0.094%
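
To put the density figure in context, the short sketch below combines it with the prompt configuration listed above. It assumes that "Activations Density" means the fraction of token positions in the dashboard prompts on which the feature is nonzero; that reading is an assumption, not something this page states. The function names are hypothetical and only serve the illustration.

```python
# Figures copied from the dashboard configuration above.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.094


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total token positions covered by the dashboard prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> int:
    """Approximate number of token positions where the feature fires.

    Assumption: density is the share of token positions with a nonzero
    activation, expressed as a percentage.
    """
    return round(total * density_percent / 100)


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, DENSITY_PERCENT)
print(f"{total:,} tokens in total, roughly {active:,} with a nonzero activation")
```

Under that assumption, the feature would fire on only a few thousand of the roughly three million token positions, which fits a narrow feature about unwanted or undesirable things.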