Explanations

as

Configuration

Prompts (Dashboard)

16,384 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

"for"            -0.32
"for"            -0.28
" represented"   -0.28
" indicated"     -0.28
" roma"          -0.27
"固定的"          -0.26
"cea"            -0.26
"称"              -0.26
"寄"              -0.24
"uru"            -0.24

Positive Logits

"metic"          0.31
"部分"            0.30
"一方"            0.29
"“"              0.28
"@("             0.28
" part"          0.28
"“"              0.26
" parte"         0.26
"pling"          0.25
" parts"         0.25

Activations Density 0.896%