INDEX

Explanations

references to helping or providing assistance

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

"Supporting"     -0.79
" LUMP"          -0.73
" supporting"    -0.72
" Supporting"    -0.72
"supporting"     -0.70
"__(/*!"         -0.66
"abetes"         -0.64
"колай"          -0.64
" חיצוניים"      -0.63
"dificio"        -0.60

Positive Logits

" helps"           1.34
" Helps"           1.18
"Helps"            1.13
"helps"            1.03
" autorytatywna"   0.63
"尽量"             0.62
"try"              0.59
" helpt"           0.59
" works"           0.58
" improves"        0.58

Activations Density 0.002%