Explanations

Pairs of items followed by the word "respectively"

Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each
Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token        Logit
" Küsten"    -0.82
" whet"      -0.82
"Might"      -0.82
"ish"        -0.81
"椀"         -0.79
"可能会"     -0.79
"žní"        -0.78
" vissa"     -0.77
"สอง"        -0.76
"ngdoc"      -0.76

Positive Logits

Token        Logit
"bows"       0.98
"لاب"        0.90
" territ"    0.89
"DAYS"       0.89
"kpop"       0.87
"gernaut"    0.85
"🏅"         0.83
"babies"     0.82
" bestemt"   0.82
"続けて"     0.82

Activations Density 0.019%
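
To give the density figure a sense of scale, the arithmetic can be sketched as follows. This is a rough illustration only: it assumes the density is the fraction of token positions on which the feature fires, and that it was measured over the dashboard prompts listed under Configuration. Neither assumption is stated on the page, and the function names are made up for this example.

```python
# Figures copied from the dashboard; everything else is an assumption.
NUM_PROMPTS = 24_576        # "24,576 prompts"
TOKENS_PER_PROMPT = 128     # "128 tokens each"
DENSITY_PERCENT = 0.019     # "Activations Density 0.019%"


def total_token_positions(num_prompts: int, tokens_per_prompt: int) -> int:
    """Number of token positions across all dashboard prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_positions(total: int, density_percent: float) -> float:
    """Positions where the feature fires, assuming density is a
    per-token fraction expressed in percent (an assumption)."""
    return total * density_percent / 100.0


total = total_token_positions(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_positions(total, DENSITY_PERCENT)

print(f"total token positions: {total:,}")
print(f"estimated active positions: {active:.0f}")
```

Under these assumptions the feature would fire on only a few hundred of roughly three million token positions, which fits a feature tied to a narrow construction such as "respectively".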