INDEX

Explanations

foreign languages/names


Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted


Negative Logits

" initially"  -0.28
"lines"  -0.26
"acular"  -0.25
" finding"  -0.25
"colon"  -0.25
"牒" (official document)  -0.25
" find"  -0.25
"foe"  -0.24
"none"  -0.24
" colon"  -0.24

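Positive Logits

"三级" (level three)  0.28
"as"  0.28
"四级" (level four)  0.26
"潞" (Lu, a character used in place names)  0.26
"不了解" (do not understand)  0.26
"百分之" (percent)  0.25
"十二条" (twelve articles)  0.25
"弹" (bullet; to spring)  0.25
" innoc"  0.25
"体质" (physique, constitution)  0.24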
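
Byte-level BPE vocabularies store non-ASCII tokens as remapped byte strings, so a Chinese token such as 三级 can appear in raw form as `ä¸īçº§`. The sketch below shows one way to map such strings back to readable text. It assumes the model behind this page uses the GPT-2 style byte-to-character table, which the page itself does not state; `bytes_to_unicode` and `decode_token` are illustrative names, not part of any dashboard API.

```python
def bytes_to_unicode() -> dict[int, str]:
    """Build the GPT-2 style table mapping each byte to a printable character."""
    byte_values = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    code_points = byte_values[:]
    offset = 0
    for b in range(256):
        if b not in byte_values:
            # Non-printable bytes are shifted to code points 256 and above.
            byte_values.append(b)
            code_points.append(256 + offset)
            offset += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: printable character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(raw: str) -> str:
    """Decode a raw vocabulary string into readable text.

    Characters outside the table (for example a literal space that the
    page has already decoded) are passed through as their UTF-8 bytes.
    """
    data = bytearray()
    for ch in raw:
        if ch in UNICODE_TO_BYTE:
            data.append(UNICODE_TO_BYTE[ch])
        else:
            data.extend(ch.encode("utf-8"))
    return data.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for raw in ["ä¸īçº§", "çĻ¾åĪĨä¹ĭ", " innoc", "as"]:
        print(repr(raw), "->", repr(decode_token(raw)))
```

ASCII tokens pass through unchanged, so the same function can be applied to every row of a logit list without special-casing.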

Activations Density 0.004%
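
The prompt configuration and the density figure can be combined into a rough token count. This is a minimal sketch, assuming that "Activations Density" means the share of dashboard tokens on which the feature has a non-zero activation (the page does not define it) and that every prompt is exactly 128 tokens long. The variable names are illustrative.

```python
# Figures taken from the dashboard configuration on this page.
NUM_PROMPTS = 16_384
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.004  # displayed value, presumably rounded

total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Assumption: density = activating tokens / total dashboard tokens.
estimated_active_tokens = total_tokens * DENSITY_PERCENT / 100

print(f"total tokens: {total_tokens:,}")
print(f"estimated activating tokens: {estimated_active_tokens:.0f}")
```

Under these assumptions the feature fires on roughly 84 of about 2.1 million tokens. Because the displayed percentage is rounded, the true count may differ by several tokens.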