Explanations

stated

Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each
Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token             Logit
" indicated"      -1.69
"indicated"       -1.48
" suggested"      -1.46
"suggested"       -1.30
" demonstrated"   -1.23
" indiqué"        -1.23
" confirmed"      -1.19
" Suggested"      -1.18
" stated"         -1.14
" implied"        -1.12

Positive Logits

Token             Logit
"the"             0.93
                  0.82
" that"           0.70
"an"              0.66
"ly"              0.64
" another"        0.63
"+#+"             0.63
" something"      0.60
"his"             0.59
" some"           0.58

Activations Density: 0.042%
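
As a rough sanity check, the configuration and density figures can be combined to estimate how many tokens the feature fires on. This is only a sketch: it assumes the density is the fraction of dashboard tokens with a nonzero activation, which the page itself does not define, and the variable names are illustrative.

```python
# Figures taken from the dashboard configuration.
num_prompts = 16_384
tokens_per_prompt = 128
density_percent = 0.042  # "Activations Density" as reported

# Total tokens in the dashboard sample.
total_tokens = num_prompts * tokens_per_prompt

# ASSUMPTION: density = (tokens with nonzero activation) / (total tokens).
approx_active_tokens = round(total_tokens * density_percent / 100)

print(f"total tokens: {total_tokens:,}")
print(f"approx. activating tokens: {approx_active_tokens:,}")
```

Under that assumption the feature would be active on roughly 881 of about 2.1 million tokens, which is consistent with a sparse feature.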