Explanations

overlook

Configuration

Prompts (Dashboard)

16,384 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token             Logit
" overlook"       -1.08
"ValueStyle"      -0.81
" overlooking"    -0.80
" overlooks"      -0.79
" متعلقه"         -0.72
" descu"          -0.57
"aspectj"         -0.57
" neglect"        -0.56
" disregard"      -0.56
"WriteBarrier"    -0.53

Positive Logits

Token             Logit
"the"             0.82
" something"      0.69
(not shown)       0.62
"FTFY"            0.61
"<bos>"           0.60
"ướng"            0.58
"OLDS"            0.57
(not shown)       0.57
")$_"             0.57
"mbols"           0.57

Activations Density 0.035%
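
To put the density figure in context, here is a rough sketch of what 0.035% amounts to on the dashboard prompt set. It assumes the density is the fraction of all dashboard tokens (16,384 prompts of 128 tokens each) on which the feature has a nonzero activation. The page does not state that definition, and the function names are hypothetical.

```python
# Figures quoted on the dashboard.
NUM_PROMPTS = 16_384
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.035


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens across all dashboard prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(density_percent: float, n_tokens: int) -> float:
    """Approximate token count on which the feature fires.

    Assumption: density = active tokens / total tokens, given as a percentage.
    """
    return density_percent / 100 * n_tokens


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(DENSITY_PERCENT, total)

print(f"total tokens: {total:,}")
print(f"estimated active tokens: {active:.0f}")
```

Under that assumption the feature fires on only a few hundred of roughly two million tokens, which fits a narrow feature tied to the word "overlook" and its variants.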