INDEX

Explanations

Short prefixes/fragments

Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token                Logit
"expandindo"         -0.75
"WriteBarrier"       -0.69
" насељу"            -0.67
"ⓧ"                  -0.64
"Hentet"             -0.57
"فحة"                -0.55
"RectangleBorder"    -0.53
" eventdata"         -0.52
"ிகள்"               -0.52
" Paglinawan"        -0.51

Positive Logits

Token     Logit
"pre"     0.86
"hit"     0.68
"pre"     0.65
"bur"     0.63
"Pre"     0.61
"Pre"     0.60
" pré"    0.59
"per"     0.59
"hit"     0.56
"Hit"     0.56

Activations Density 0.002%
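
To put the density figure in context, here is a rough back-of-the-envelope sketch in Python. It assumes the activation density was measured over the dashboard prompts listed under Configuration (16,384 prompts of 128 tokens each); the page does not state this, so treat the result as an estimate only. The variable names are illustrative, and the 0.002% value is the rounded figure as displayed.

```python
# Figures copied from the dashboard.
NUM_PROMPTS = 16_384
TOKENS_PER_PROMPT = 128
DENSITY_PERCENT = 0.002  # rounded value as displayed on the page

# Total number of tokens across all dashboard prompts.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Assumption (not stated on the page): density is the share of these
# tokens on which the feature has a nonzero activation.
expected_active_tokens = total_tokens * DENSITY_PERCENT / 100

print(f"total tokens: {total_tokens:,}")
print(f"expected active tokens: {expected_active_tokens:.1f}")
```

Under that assumption, the feature would fire on roughly 42 of about 2.1 million tokens. Because the displayed density is rounded, the true count could differ by several tokens either way.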