INDEX

Explanations

references to specific highlighted points or noteworthy information within a text

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Rhestr

-0.85

🏻‍♀️

-0.66

 kasarigan

-0.65

 utafitiHapana

-0.63

BibitemShut

-0.63

՚

-0.63

AllAfrica

-0.60

boten

-0.59

ConstraintMaker

-0.59

‪

-0.58

Positive Logits

RegressionTest

0.68

yyl

0.68

<eos>

0.65

 élas

0.64

참고

0.61

 Vikipedi

0.59

Transkript

0.56

 PyLong

0.56

 femininas

0.55

UnusedPrivate

0.54

Activations Density 0.385%
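
To give a rough sense of scale for the density figure, the arithmetic can be sketched as below. This assumes (the page does not state it) that the activation density is the share of token positions in the dashboard prompts where the feature has a non-zero activation; the variable names are illustrative only.

```python
# Figures copied from the dashboard configuration and density line.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.385

# Assumption: density = fraction of token positions with non-zero activation,
# measured over the dashboard prompts. The page does not confirm this.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT
approx_active_tokens = round(total_tokens * DENSITY_PERCENT / 100)

print(f"total tokens: {total_tokens:,}")
print(f"approx. active tokens: {approx_active_tokens:,}")
```

Under that assumption the feature would fire on roughly twelve thousand of about 3.1 million token positions, which is consistent with a sparse feature.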