Explanations

ttainable

Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token           Logit
" variation"    -0.84
" crosses"      -0.80
" تانيه"        -0.79
"variation"     -0.75
" argument"     -0.74
" Polynesia"    -0.74
" reasoning"    -0.71
"argument"      -0.69
" Variation"    -0.69
" Argument"     -0.69

Positive Logits

Token           Logit
"né"            0.63
"of"            0.61
"ting"          0.53
"aira"          0.52
"ben"           0.52
"ait"           0.51
"aire"          0.50
" both"         0.49
"cin"           0.48
"chel"          0.47

Activations Density: 1.677%
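
To put the density figure in context, the short sketch below converts it into an approximate token count using the dashboard configuration listed above (16,384 prompts of 128 tokens each). It assumes that "Activations Density" means the share of dashboard tokens on which the feature has a nonzero activation; the page does not state this definition, so treat the result as a rough estimate. The variable names are illustrative only.

```python
# Assumption: "Activations Density" is the fraction of dashboard tokens on
# which the feature fires (nonzero activation). The page does not confirm this.
NUM_PROMPTS = 16_384          # from the Configuration section
TOKENS_PER_PROMPT = 128       # from the Configuration section
ACTIVATION_DENSITY = 1.677 / 100

# Total number of tokens covered by the dashboard prompts.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Approximate number of tokens on which the feature is active.
approx_active_tokens = round(total_tokens * ACTIVATION_DENSITY)

print(f"total tokens: {total_tokens:,}")
print(f"approx. active tokens: {approx_active_tokens:,}")
```

Under that assumption, the feature would be active on roughly 35,000 of about 2.1 million tokens.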