Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token           Logit
’”              -0.66
Portale         -0.63
?“              -0.63
 ?”             -0.62
?”,             -0.61
 betweenstory   -0.61
 nahilalakip    -0.61
stdc            -0.61
colades         -0.61
’?              -0.60

Positive Logits

Token           Logit
 information    0.54
 Operation      0.46
 everything     0.46
 info           0.45
Master          0.45
 masters        0.45
aí              0.44
(not shown)     0.43
 thorough       0.43
Body            0.43
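
The negative and positive lists can be read as the two extremes of one per-token score vector: the lowest entries and the highest entries. A minimal sketch of that selection follows, assuming the scores are available as a plain mapping. The `logit_effects` name and the `extremes` helper are hypothetical, and only a few of the listed values are copied in.

```python
# Hypothetical mapping from token string to logit value. The numbers are
# copied from the dashboard lists; the container itself is an assumption.
logit_effects = {
    " information": 0.54,
    " Operation": 0.46,
    " everything": 0.46,
    "Portale": -0.63,
    "stdc": -0.61,
    "colades": -0.61,
}


def extremes(scores, k):
    """Return (k lowest, k highest) items, each ordered from most extreme."""
    ranked = sorted(scores.items(), key=lambda item: item[1])
    lowest = ranked[:k]
    highest = ranked[::-1][:k]
    return lowest, highest


negative, positive = extremes(logit_effects, k=2)
print(negative)
print(positive)
```

Sorting once and slicing both ends keeps the two lists consistent with each other; for a full vocabulary a partial sort would be cheaper, but that is beyond this sketch.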

Activations Density 0.007%
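
For a rough sense of scale, the density can be converted into a token count. This is a sketch under one loud assumption: that the 0.007% figure is measured over the same 16,384 prompts of 128 tokens listed under Configuration, which the page does not state.

```python
# Figures copied from the dashboard configuration.
NUM_PROMPTS = 16_384
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.007

# Assumption: density is the share of these tokens with a nonzero activation.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT
active_tokens = total_tokens * DENSITY_PERCENT / 100

print(f"total tokens: {total_tokens:,}")
print(f"approx. active tokens: {active_tokens:.0f}")
```

Under that assumption the feature fires on roughly 147 of about 2.1 million tokens.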