Explanations

code and copyright

Configuration

Prompts (Dashboard)

16,384 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

 whether

-0.42

是否 (Chinese: "whether")

-0.41

 Whether

-0.41

whether

-0.40

Whether

-0.38

是否有 (Chinese: "whether there is")

-0.37

 WHETHER

-0.36

与否 (Chinese: "or not")

-0.34

bool

-0.30

HING

-0.30
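
Non-ASCII vocabulary entries in lists like these are often stored in byte-level BPE form, where each UTF-8 byte is shown as a stand-in character, so the Chinese token 是否 ("whether") appears as `æĺ¯åĲ¦` in the raw vocabulary. The sketch below reverses that mapping. It assumes the GPT-2 style byte-to-character table; the page does not say which tokenizer produced these tokens, so treat the table choice and the helper name `decode_token` as assumptions.

```python
def bytes_to_unicode():
    """Build the GPT-2 style table: byte value -> printable stand-in character."""
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(ord("¡"), ord("¬") + 1))
          + list(range(ord("®"), ord("ÿ") + 1)))
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            # Non-printable bytes are shifted to code points 256 and above.
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: stand-in character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(raw: str) -> str:
    """Hypothetical helper: turn a raw byte-level BPE token into readable text."""
    data = bytearray()
    for ch in raw:
        if ch in CHAR_TO_BYTE:
            data.append(CHAR_TO_BYTE[ch])
        else:
            # Character was already decoded (for example a plain space).
            data.extend(ch.encode("utf-8"))
    return data.decode("utf-8", errors="replace")


print(decode_token("æĺ¯åĲ¦"))  # 是否
```

Decoded this way, the non-ASCII entries in the negative list read as Chinese forms of "whether" and "or not", in line with the English tokens around them.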

Positive Logits

@Json

0.26

ihar

0.26

埔

0.26

 physically

0.25

 Universe

0.24

剐

0.23

 censorship

0.23

 poll

0.23

的性格 (Chinese: "'s personality")

0.23

 gradually

0.23

Activations Density 0.889%
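
To give the density figure a sense of scale, the arithmetic below combines it with the dashboard prompt configuration. This is a rough sketch only: it assumes that density means the fraction of token positions with a nonzero activation and that it was measured over the same 16,384 prompts of 128 tokens, neither of which the page states.

```python
# Values taken from the dashboard configuration and the density readout.
n_prompts = 16_384
tokens_per_prompt = 128
density = 0.889 / 100  # 0.889% expressed as a fraction

# Total token positions covered by the dashboard prompts.
total_tokens = n_prompts * tokens_per_prompt

# Assumption: density = active positions / total positions.
approx_active = round(total_tokens * density)

print(total_tokens)   # 2097152
print(approx_active)  # 18644
```

Under those assumptions the feature would be active on roughly 18,600 of about 2.1 million token positions.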