Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each
Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token           Logit
" were"         -0.89
"are"           -0.72
" contain"      -0.71
" deserve"      -0.71
" have"         -0.71
" correspond"   -0.70
" belong"       -0.68
" weren"        -0.67
" appear"       -0.67
" seem"         -0.66

Positive Logits

Token           Logit
"is"            0.90
" grows"        0.65
" drives"       0.64
" stays"        0.64
" tries"        0.64
"stays"         0.63
" uses"         0.63
" collects"     0.62
" mounts"       0.61
" applies"      0.61

Activations Density: 0.056%
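
As a rough sanity check on the density figure, the sketch below combines it with the dashboard configuration (16,384 prompts of 128 tokens each). It assumes, without confirmation from this page, that the density is the fraction of dashboard tokens on which the feature has a nonzero activation. The function names are hypothetical and exist only for illustration.

```python
# Values copied from the dashboard configuration and the density readout.
NUM_PROMPTS = 16_384
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.056


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of token positions covered by the dashboard prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> float:
    """Estimated count of tokens with nonzero activation.

    Assumption (not stated on the page): density = active tokens / total tokens.
    """
    return total * density_percent / 100


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, DENSITY_PERCENT)

print(total)          # 2097152
print(round(active))  # 1174
```

Under that assumption, the feature would fire on roughly 1,200 of about 2.1 million tokens. The density is reported to only two significant figures, so the count is approximate.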