Explanations

couch

Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each
Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

" success"        -0.81
" consistent"     -0.68
"consistent"      -0.67
" inconsistent"   -0.63
" couch"          -0.59
" Muses"          -0.57
"-------"         -0.57
" nahilalakip"    -0.56
"theless"         -0.55
"ized"            -0.54

Positive Logits

" liga"     0.55
"sent"      0.55
"iest"      0.55
"scar"      0.55
"sharing"   0.54
"י"         0.54
"UAWEI"     0.53
"er"        0.52
"shows"     0.52
"save"      0.52

Activations Density: 0.176%