Explanations

introduces statements with "So"

Configuration

Prompts (Dashboard)

24,576 prompts, 128 tokens each

Dataset (Dashboard)

monology/pile-uncopyrighted

Negative Logits

Token            Logit
"is"             -2.06
[token missing]  -1.74
[token missing]  -1.73
[token missing]  -1.70
"of"             -1.67
"↵↵"             -1.58
" acercarse"     -1.58
"潆"             -1.53
"You"            -1.53
" และ"           -1.52

Positive Logits

Token            Logit
"⸰"              1.93
" infal"         1.91
"۰۰"             1.81
" esat"          1.80
" monstru"       1.80
" tremend"       1.73
" concr"         1.73
" precau"        1.66
"㊗"             1.66
" metic"         1.65

Activations Density 0.023%
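
As a rough sanity check on the figures above, the following sketch estimates how many dashboard tokens would activate this feature. It assumes the reported density is the fraction of all dashboard tokens on which the feature has a nonzero activation, which the page itself does not state. The constant and function names are illustrative only.

```python
# Figures reported on the dashboard page.
NUM_PROMPTS = 24_576
TOKENS_PER_PROMPT = 128
ACTIVATION_DENSITY_PERCENT = 0.023


def total_tokens(num_prompts: int, tokens_per_prompt: int) -> int:
    """Total number of tokens across all dashboard prompts."""
    return num_prompts * tokens_per_prompt


def estimated_active_tokens(total: int, density_percent: float) -> float:
    """Estimated token count with nonzero activation.

    Assumption: density is a percentage of all dashboard tokens.
    """
    return total * density_percent / 100.0


total = total_tokens(NUM_PROMPTS, TOKENS_PER_PROMPT)
active = estimated_active_tokens(total, ACTIVATION_DENSITY_PERCENT)

print(f"total tokens: {total:,}")
print(f"estimated activating tokens: {active:.0f}")
```

Under that assumption, the feature fires on the order of 700 tokens out of roughly 3.1 million. Because the density is rounded to two significant figures, the estimate is only approximate.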