INDEX

Explanations

py

Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token            Logit
" Plane"         -0.75
" Plate"         -0.72
"Law"            -0.71
" Line"          -0.71
" Leader"        -0.69
" Generator"     -0.69
" Frame"         -0.68
" House"         -0.68
" Generation"    -0.68
" Stock"         -0.68

Positive Logits

Token            Logit
" engel"         0.25
" arab"          0.24
"dak"            0.24
" russell"       0.24
"aimana"         0.24
" blanc"         0.23
" jude"          0.23
"lav"            0.23
" mexicana"      0.23
" romano"        0.23

Activations Density 0.002%
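
To put the density figure in context, here is a rough back-of-the-envelope sketch. It assumes (this is not stated on the page) that the activation density is the fraction of dashboard tokens on which the feature fires, measured over the 16,384 prompts of 128 tokens each listed under Configuration. The variable names are illustrative only.

```python
# Assumption: density = (tokens where the feature is active) / (total dashboard tokens).
# The three constants below are copied from the dashboard values shown above.
NUM_PROMPTS = 16_384
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.002  # displayed value, presumably rounded

# Total number of tokens in the dashboard prompt set.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Approximate number of tokens on which the feature would be active.
approx_active_tokens = total_tokens * DENSITY_PERCENT / 100

print(f"total tokens: {total_tokens:,}")
print(f"approx. active tokens: {approx_active_tokens:.0f}")
```

Under that assumption the feature would be active on only a few dozen of roughly two million tokens. Because the displayed density is rounded, the true count could differ noticeably from this estimate.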