Explanations

fairly/fair

Configuration

Prompts (Dashboard)

16,384 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

For            -0.85
FOR            -0.65
}*/            -0.59
{}",           -0.56
 otomatig      -0.53
 facilities    -0.50
']){           -0.50
'][]           -0.50
منى            -0.50
 Fair          -0.50

Positive Logits

 purpoſe       1.06
 Theſe         1.02
 ſtate         0.99
 themſelves    0.96
 becauſe       0.95
 ſeveral       0.95
 ſhe           0.94
 faſt          0.93
 theſe         0.92
 whoſe         0.92

Activations Density 0.262%
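
To give the density figure a sense of scale, the short sketch below converts it into an approximate token count. It rests on two assumptions that the page does not state: that the density is measured over the same dashboard prompt set (16,384 prompts of 128 tokens each), and that it is the fraction of tokens on which the feature has a nonzero activation. The variable names are illustrative only.

```python
# Values copied from the dashboard configuration and density readout.
NUM_PROMPTS = 16_384
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY = 0.262 / 100  # 0.262% expressed as a fraction

# Total number of token positions in the dashboard prompt set.
total_tokens = NUM_PROMPTS * TOKENS_PER_PROMPT

# Assumption: density = (tokens with nonzero activation) / (total tokens).
approx_active_tokens = round(total_tokens * ACTIVATION_DENSITY)

print(f"total tokens:         {total_tokens:,}")
print(f"approx. active tokens: {approx_active_tokens:,}")
```

Under those assumptions the feature fires on roughly 5,500 of about 2.1 million tokens. Because the displayed density is rounded to three significant figures, the count is an estimate, not an exact value.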