INDEX

Explanations

aka, Can, Say, Difficult

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token           Logit
"Waga"          -1.63
"orienti"       -1.50
" okolo"        -1.45
" 作り方"       -1.41
"zucht"         -1.41
"서울"          -1.40
" ambassade"    -1.38
"rildi"         -1.37
"colgante"      -1.36
"Sair"          -1.36

Positive Logits

Token           Logit
" only"         2.13
" then"         1.87
" also"         1.80
" like"         1.69
" most"         1.59
" didn"         1.49
" mostly"       1.42
" biggest"      1.41
" probably"     1.39
" usually"      1.37

Activations Density: 0.145%
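
To put the density figure in context, here is a minimal sketch of the arithmetic. It assumes the 0.145% density is the fraction of dashboard tokens on which the feature has a non-zero activation, and that it was measured over the same 24,576 prompts of 128 tokens listed under Configuration. The page states neither point, so treat both as assumptions; the function names are illustrative only.

```python
# Figures copied from the dashboard configuration.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY = 0.145 / 100  # 0.145% expressed as a fraction


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Number of token positions covered by the dashboard prompts."""
    return num_prompts * tokens_per_prompt


def expected_active_tokens(density: float, n_tokens: int) -> float:
    """Token positions with a non-zero activation.

    Assumption (not confirmed by the page): density is defined as
    (tokens with non-zero activation) / (all tokens).
    """
    return density * n_tokens


n_tokens = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
n_active = expected_active_tokens(ACTIVATION_DENSITY, n_tokens)

print(f"{n_tokens:,} tokens in total")
print(f"about {round(n_active):,} tokens with a non-zero activation")
```

Under these assumptions the feature is sparse: it would fire on roughly 4,561 of 3,145,728 token positions, or about one token in every 690.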