Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token                        Logit
" correctly"                 -0.36
" correct"                   -0.35
瞧 ("look")                  -0.35
正确的 ("correct")           -0.30
" hanging"                   -0.29
" often"                     -0.28
正确 ("correct")             -0.27
不小的 ("not small")         -0.27
" accurate"                  -0.26
пров (Russian fragment)      -0.26

Positive Logits

Token                        Logit
都不是 ("none of them is")   0.32
"ysis"                       0.28
又是 ("is again")            0.28
透明 ("transparent")         0.28
"ys"                         0.27
爆炸 ("explosion")           0.27
" Transparency"              0.26
"]]["                        0.26
"unset"                      0.25
中新 (Chinese fragment)      0.25

Activations Density 0.902%
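
The density figure can be related to the prompt configuration with a little arithmetic. The sketch below assumes (the page does not state this) that activation density is the share of dashboard tokens on which the feature fires, measured over the 16,384 prompts of 128 tokens each. The function name and constants are illustrative, not part of any dashboard API.

```python
# Figures copied from the dashboard; the interpretation of "density" as
# "fraction of dashboard tokens with a nonzero activation" is an assumption.
NUM_PROMPTS = 16_384
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.902


def implied_active_tokens(num_prompts, tokens_per_prompt, density_percent):
    """Return (total tokens, approximate number of tokens where the feature fires)."""
    total = num_prompts * tokens_per_prompt
    active = round(total * density_percent / 100)
    return total, active


total_tokens, active_tokens = implied_active_tokens(
    NUM_PROMPTS, TOKENS_PER_PROMPT, DENSITY_PERCENT
)
print(f"total tokens: {total_tokens:,}")
print(f"approx. active tokens: {active_tokens:,}")
```

Under that assumption the feature would fire on roughly 19 thousand of about 2.1 million tokens. If density was measured over a different sample, the count changes accordingly.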