Explanations

software/technical language

Configuration

Prompts (Dashboard)

16,384 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token      Logit
"Or"       -0.66
"And"      -0.63
"or"       -0.59
" They"    -0.59
"and"      -0.58
"But"      -0.52
"OR"       -0.52
"We"       -0.51
"Or"       -0.50
"You"      -0.48

Positive Logits

Token      Logit
"there"    0.80
"it"       0.79
"with"     0.75
"without"  0.75
"that"     0.73
"this"     0.72
"while"    0.71
"instead"  0.71
"when"     0.71
"if"       0.69

Activations Density 0.010%
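
For a sense of scale, the density figure can be combined with the prompt configuration listed above. The sketch below is only a back-of-the-envelope estimate: it assumes that density means the fraction of scanned tokens on which the feature fires, which the page itself does not state, and the variable names are illustrative.

```python
# Figures copied from the dashboard: prompt configuration and density readout.
NUM_PROMPTS = 16_384
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.010  # displayed value, already rounded by the dashboard

# Assumption (not stated on the page): density = activating tokens / total tokens.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT
approx_active_tokens = total_tokens * DENSITY_PERCENT / 100

print(f"total tokens scanned: {total_tokens:,}")
print(f"approx. activating tokens: {approx_active_tokens:.0f}")
```

Under that assumption, the feature fires on roughly 210 of about 2.1 million tokens. Because the displayed density is rounded, this is an order-of-magnitude estimate only.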