Explanations

research and modifications

Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token          Logit
"ſelf"         -1.17
"expandindo"   -1.16
" itſelf"      -1.14
" Jefus"       -1.12
" purpoſe"     -1.09
" myſelf"      -1.08
" ſche"        -1.05
" poffible"    -1.05
" pleaſure"    -1.05
"ſelves"       -1.03

Positive Logits

Token      Logit
"of"       0.64
"is"       0.60
"may"      0.57
"an"       0.54
"and"      0.53
"can"      0.52
"id"       0.52
"ius"      0.51
(blank)    0.49
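The page does not say how these logit lists are produced. On dashboards of this kind they are commonly obtained by projecting a feature's output direction onto each token's unembedding vector and listing the extremes. The following is a minimal sketch under that assumption; the function names, the two-dimensional vectors, and the tokens `a` to `d` are hypothetical toy values, not data from this page.

```python
# Toy sketch: rank vocabulary tokens by their logit effect for one feature.
# ASSUMPTION: logit effect = dot(feature direction, token unembedding vector).
# All vectors and tokens here are made-up illustrations, not dashboard data.

def logit_effects(direction, unembed):
    """Return {token: dot(direction, unembed[token])}."""
    return {
        tok: sum(d * w for d, w in zip(direction, vec))
        for tok, vec in unembed.items()
    }

def top_and_bottom(effects, k):
    """Return (k most positive, k most negative) as lists of (token, value)."""
    ranked = sorted(effects.items(), key=lambda kv: kv[1], reverse=True)
    positive = ranked[:k]
    # Most negative first, matching how the Negative Logits list is ordered.
    negative = sorted(ranked[-k:], key=lambda kv: kv[1])
    return positive, negative

direction = [1.0, -0.5]          # hypothetical feature direction
unembed = {                      # hypothetical unembedding vectors
    "a": [0.5, 0.0],
    "b": [0.0, 1.0],
    "c": [1.0, 0.0],
    "d": [-1.0, 0.0],
}

effects = logit_effects(direction, unembed)
positive, negative = top_and_bottom(effects, k=2)
print("positive:", positive)
print("negative:", negative)
```

Under this reading, a positive value means the feature pushes the model toward predicting that token, and a negative value means it pushes away from it.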

Activations Density 0.102%
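To put the density figure in context, it can be combined with the dashboard configuration (16,384 prompts of 128 tokens each). The sketch below assumes that the density is the fraction of token positions in those prompts on which the feature is active and that every prompt is exactly 128 tokens long; the page states neither, so the result is a rough estimate only.

```python
# Values copied from the dashboard configuration and density line.
NUM_PROMPTS = 16_384
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY = 0.102 / 100  # 0.102% expressed as a fraction

# ASSUMPTION: density is measured over these same dashboard tokens.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT
expected_active = total_tokens * ACTIVATION_DENSITY

print(f"total tokens: {total_tokens:,}")
print(f"estimated activating tokens: {expected_active:,.0f}")
```

Under these assumptions the feature fires on roughly two thousand of about two million tokens.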