INDEX

Explanations

numbers followed by units or code paths

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

" layout"    -0.81
"Layout"     -0.73
"ODES"       -0.73
" Ве"        -0.72
" Flint"     -0.71
" Webb"      -0.71
"adra"       -0.71
"ellery"     -0.71
" ilość"     -0.69
"ardino"     -0.68

Positive Logits

"byl"            0.73
" oranges"       0.70
" cerdo"         0.65
"一来"           0.64
" пони"          0.62
"ۗ"              0.62
" orange"        0.61
"ajo"            0.61
"annya"          0.60
" overridden"    0.60

Activations Density 0.064%
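
To put the density figure in context, the sketch below works through the arithmetic implied by the configuration above: 24,576 prompts of 128 tokens each come to 3,145,728 tokens. It assumes, and this is only an assumption, that "Activations Density" means the fraction of those dashboard tokens on which the feature is active. The function names are hypothetical and are not part of any dashboard tooling.

```python
# Values copied from the dashboard configuration.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.064  # "Activations Density", given as a percentage


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens across all dashboard prompts."""
    return num_prompts * tokens_per_prompt


def approx_active_tokens(total: int, density_percent: float) -> int:
    """Estimate how many tokens activate the feature.

    Assumption: density = active tokens / total tokens. The dashboard
    may define the denominator differently.
    """
    return round(total * density_percent / 100)


if __name__ == "__main__":
    total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
    print(f"total tokens: {total:,}")
    print(f"approx. active tokens: {approx_active_tokens(total, DENSITY_PERCENT):,}")
```

Under that assumption the feature fires on roughly 2,000 of the dashboard tokens, which is why the top-activation examples for a feature this sparse cover only a small slice of the dataset.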