Explanations

symbols and separators

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Logit   Token
-1.95   "to"
-1.84   " anunciado"
-1.80   "the"
-1.76   " eenvoudig"
-1.75   " with"
-1.72   " Then"
-1.70   " explicado"
-1.68   "珦"
-1.68   "in"
-1.60   "与其"

Positive Logits

Logit   Token
1.83
1.80
1.61
1.61    "所有的"
1.58    "This"
1.56
1.53    "Our"
1.51
1.50
1.49    "They"

Activations Density 0.003%