Configuration

Prompts (Dashboard): 24,576 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token         Logit
"ன்"          -0.82
"虞"          -0.81
"Попис"       -0.79
"яс"          -0.79
"tiéndose"    -0.77
"rimid"       -0.77
"fiés"        -0.73
"fron"        -0.73
"樹"          -0.73
" tapes"      -0.73

Positive Logits

Token          Logit
" embedding"   1.29
" Embedding"   1.05
" embed"       0.96
"embedding"    0.94
" embedded"    0.93
" cover"       0.91
"hiding"       0.90
" secret"      0.88
"Embedding"    0.88
"robust"       0.88

Activations Density 0.019%