Explanations

The experiment

programmable code constructs

Configuration

Prompts (Dashboard)

392,802 prompts, 256 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted
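
The two configuration figures above imply a total token count for the dashboard. A minimal sketch of that arithmetic follows, assuming every prompt is exactly 256 tokens long, as the "256 tokens each" label suggests. The function name `total_tokens` is hypothetical and is not part of any dashboard API.

```python
# Values copied from the Configuration section of this page.
NUM_PROMPTS = 392_802
TOKENS_PER_PROMPT = 256  # assumption: every prompt has exactly this length


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Return the number of token positions covered by the dashboard prompts."""
    return num_prompts * tokens_per_prompt


if __name__ == "__main__":
    # Print the total with thousands separators for readability.
    print(f"{total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT):,} tokens")
```

Under that assumption the dashboard covers roughly 100 million token positions.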

Negative Logits

Token        Logit
"our"        0.75
" this"      0.68
"เรา"        0.68
" Into"      0.67
" Leland"    0.64
" Emily"     0.63
" innov"     0.61
" нашим"     0.61
" 우리의"     0.61
" inode"     0.60

Positive Logits

Token        Logit
"も"         0.73
" polémica"  0.70
" رفض"       0.70
"。"         0.68
" తీవ్ర"      0.67
" کي"        0.66
"錯誤"       0.65
"защи"       0.63
" ಪೊಲೀಸ"     0.62
"う"         0.62

Activations Density 0.912%
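
To give the density figure a sense of scale, the sketch below converts it into an estimated count of activating tokens. It assumes the density is the fraction of token positions with a nonzero activation and that it was measured over the 392,802 dashboard prompts of 256 tokens each. The page does not state either point, so the result is an estimate only. The function name `estimated_active_tokens` is hypothetical.

```python
# Values copied from this page.
NUM_PROMPTS = 392_802
TOKENS_PER_PROMPT = 256
ACTIVATION_DENSITY = 0.912 / 100  # 0.912% expressed as a fraction


def estimated_active_tokens(num_prompts: int, tokens_per_prompt: int, density: float) -> int:
    """Estimate how many token positions activate the feature.

    Assumption: `density` is the fraction of all token positions in the
    dashboard prompts on which the activation is nonzero.
    """
    return round(num_prompts * tokens_per_prompt * density)


if __name__ == "__main__":
    count = estimated_active_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT, ACTIVATION_DENSITY)
    print(f"about {count:,} activating tokens")
```

If the assumption holds, the feature fires on roughly one token in 110.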