INDEX

Explanations

theory

Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token               Logit
" Theory"           -0.71
"theory"            -0.66
" theory"           -0.65
"kloped"            -0.64
" InputDecoration"  -0.60
" BoxDecoration"    -0.60
" Vikipedi"         -0.59
"helial"            -0.59
" IConfiguration"   -0.59
"Theory"            -0.58

Positive Logits

Token               Logit
"رشف"               0.51
"of"                0.50
":✨"                0.46
" sebaliknya"       0.46
" kwanza"           0.44
">{@"               0.42
"Seeing"            0.42
" cost"             0.41
"iteness"           0.41
" olev"             0.40

Activations Density: 0.005%