INDEX

Explanations

Causes / affected

Configuration

Prompts (Dashboard): 16,384 prompts, 128 tokens each

Dataset (Dashboard): monology/pile-uncopyrighted

Negative Logits

Token          Logit
" causes"      -1.52
" Causes"      -1.41
"Causes"       -1.41
"causes"       -1.41
" Ursachen"    -0.95
" affected"    -0.92
" causas"      -0.90
" Affected"    -0.86
"affected"     -0.84
"cause"        -0.83

Positive Logits

Token          Logit
"awtextra"     0.75
"#+#"          0.63
"unnitel"      0.62
"jątk"         0.54
" Staaten"     0.52
"phazard"      0.50
" rush"        0.50
" Riesen"      0.49
"umbag"        0.49
" paljon"      0.49

Activations Density: 0.034%