Explanations

cause

Configuration

Prompts (Dashboard)

16,384 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

" cause"     -2.05
"cause"      -1.84
"Cause"      -1.78
" Cause"     -1.77
" causes"    -1.73
" CAUSE"     -1.69
"causes"     -1.58
" Causes"    -1.57
" caused"    -1.55
"CAUSE"      -1.48

Positive Logits

"the"              0.82
                   0.74
" their"           0.69
"UnusedPrivate"    0.63
"any"              0.62
" those"           0.62
" this"            0.62
"XmlAccessType"    0.59
"an"               0.59
" some"            0.58

Activations Density 0.036%
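
To put the density figure in context, here is a rough sketch of how many tokens it corresponds to. It assumes, without confirmation from the dashboard, that the density is measured over the same prompt set listed under Configuration (16,384 prompts of 128 tokens each) and that it means the fraction of tokens on which the feature fires. The variable names are illustrative only.

```python
# Assumption: "Activations Density" is the share of dashboard tokens
# on which the feature has a nonzero activation.
PROMPTS = 16_384           # from "Prompts (Dashboard)"
TOKENS_PER_PROMPT = 128    # from "Prompts (Dashboard)"
DENSITY_PERCENT = 0.036    # from "Activations Density"

# Total number of token positions in the dashboard prompt set.
total_tokens = PROMPTS * TOKENS_PER_PROMPT

# Approximate number of token positions where the feature is active.
active_tokens = total_tokens * DENSITY_PERCENT / 100

print(f"total tokens: {total_tokens:,}")
print(f"approx. active tokens: {active_tokens:.0f}")
```

Under these assumptions the feature would be active on roughly 755 of about 2.1 million tokens, which fits a sparse feature tied to one word family.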